ABSTRACT

In cooperation with the Commission on Preservation
and Access, Xerox Corporation, Sun Microsystems, Inc., and the New
York State Program for the Conservation and Preservation of Library
Research Materials, Cornell University (New York) studied and
established the effectiveness of digital technology to preserve and
make available research library materials, and evaluated image capture
quality in binary scanning, digital computer output microfilm, and
extended network access to the Digital Library through a
client/server architecture. The main conclusions of the project are:
(1) effective access over the Internet to an image-based digital
library can be achieved from a variety of workstations; (2) Cornell
has defined and will implement a digital document control structure
that incorporates the best features of various Xerox prototype 
systems; (3) digital computer output microfilm that meets national 
standards for quality can be produced from 600 dpi (dots per inch) 
binary scanning; (4) binary scanning can reproduce many categories of 
printed illustrations and archival material in a manner superior or
comparable to the quality obtained with standard light lens photocopy 
and microfilm processes; and (5) the infrastructure developed for 
library preservation and access activities supports other 
applications in the electronic dissemination of information. Five 
appendices cover: the CLASS scanning system; document architecture 
description; testbed description; "DocuTech-printed" examples; and
screen descriptions from digital library UNIX client. (SWC) 

I. Executive Summary



Cornell has embarked on a program to encourage the use of 
digital technology to enhance access to library materials and 
to provide a new alternative for preservation reformatting of 
brittle library material. Phase I of this program, a joint study 
conducted by Cornell with the Commission on Preservation 
and Access and the Xerox Corporation, led to a number of 
conclusions regarding preservation, access, electronic 
technology, and the role of the library. In particular, Cornell 
established the effectiveness of digital technology to preserve 
and make available research library materials. 

In Phase II, Cornell extended its exploration of the use of
digital imaging technology by establishing a Testbed for
evaluating both new uses of digital technology for library
applications and new technologies that may advance library
preservation and access. The Testbed, as in Phase I a
collaborative effort on the part of Cornell Information
Technologies and the Cornell University Library, has been
sponsored by the Commission on Preservation and Access,
with additional support from Xerox Corporation, Sun 
Microsystems, Inc., and the New York State Program for the 
Conservation and Preservation of Library Research Materials. 

At the conclusion of Phase II, a Testbed facility has been 
established, the quality of image capture capabilities 
associated with binary scanning further evaluated, the 
creation of digital computer output microfilm explored, and 
network access to the Digital Library extended through the 
development of a client/server architecture. 

Main Conclusions

1. Effective access over the Internet to an image-based digital library can be 
achieved from a variety of workstations. 

Phase I demonstrated the feasibility of remote access through the 
delivery of digital images over the Cornell network for printing and 
for viewing on a prototype workstation. Phase II development 
centered on defining the architecture and developing systems 
needed to support extended remote access to the digital library. 
Internet access is provided via a digital library server and software 
"clients" designed to run on standard desktop computers that
students and faculty members would commonly have, such as
Macintoshes, IBM PCs, and Sun workstations. The client software
provides access across the Internet at speeds comparable to what is 
available locally. Client/server computing is an evolving 
application architecture that is expected to play a major role in 
providing network access to digital libraries across the country. 

2. Cornell has defined a document control structure that incorporates the 
best features of the various Xerox prototype systems from Phase I, and will 
maintain its digital library in that form. 

Through the implementation of document control structures, 
digital technology offers a means to facilitate access and to provide 
links between the bibliographic record and material located in the 
digital library. Preliminary experiments with network viewing of 
digital books have verified the early assumption that information
about the internal organization of a document, its structure, is 
essential for ease of navigation from a viewstation. Cornell has 
defined a non-proprietary document architecture which 
incorporates the best parts of the various prototypes from Phase I, 
and has decided to maintain its digital library in that form. 

3. Digital computer output microfilm that meets national standards for 
quality can be produced from 600 dpi binary scanning. 

Cornell University, in cooperation with Image Graphics, Inc., tested 
the feasibility of producing microfilm from high resolution digital 
images by means of an electron beam recorder. Cornell evaluated 
the quality of the resulting film, computed its "digital resolution"
based on a formula recently developed by an AIIM technical 
committee, and compared it to printers' type sizes used by 
publishers during the period 1800-1950. Based on these analyses, 
Cornell has concluded that a scanning resolution of 600 dots per 
inch is sufficient to produce digital computer output microfilm 
(COM) that meets ANSI/AIIM standards for image quality for
virtually all books published during the period of paper's greatest 
brittleness. 

Although the Cornell experiment demonstrated the technical 
feasibility of producing preservation microfilm, some of the issues 
surrounding quality, processing, costs, and vendor services 
associated with the conversion process have yet to be resolved. 
Cornell will continue its investigation into the use of digital 
technology to produce microfilm that meets preservation standards, 
while also allowing for the flexibility in storage, distribution, and 
access associated with the technology. 



4. Binary scanning can reproduce many categories of printed illustrations 
and archival material in a manner superior or comparable to the quality 
obtained with standard light lens photocopy and microfilm processes. 



Where Phase I focused on preserving brittle books that were largely 
textual, Phase II extended the evaluation to include a review of the 
applicability of digital imaging technology for printed illustrations 
and a wide array of archival material. Based on this 
experimentation, Cornell has concluded that binary scanning can 
result in the production of paper facsimiles for a wide range of 
material that are superior or comparable to photocopy versions. For 
some material, and for purposes other than printing, however, gray 
scale or color scanning may be more appropriate reformatting 
options. More experimental work is needed to examine the various 
tradeoffs associated with the use of gray scale and color scanning. 



5. The infrastructure developed for library preservation and access activities 
supports other applications in the electronic dissemination of 
information. 



The infrastructure created to support the Testbed provides the basic 
components for many electronic publishing applications and is 
designed to encourage widespread collaboration among institutions. 
Cornell University is presently conducting several collaborative 
projects for incorporating other material into the digital library, 
including current periodicals, dissertations, research reports, and 
newly published books. Discussions are underway with other
institutions that would lead to the creation of union collections 
accessible in a common fashion over the Internet. 

II. Testbed Projects



Cornell University has established a testbed environment to evaluate,
test, and advance the role of digital technologies in preserving and
enhancing access to deteriorating library materials, thereby providing a
link between prototype technologies and activities and their translation
into production services. The testbed is a corporate venture between
Cornell University Library and Cornell Information Technologies. The
Testbed facilities, staff, and testing methodology are described in
Appendix III.

This testbed built upon the prototype activities conducted as part of the
CLASS Project that was jointly supported by the Commission on
Preservation and Access, Cornell, and the Xerox Corporation [1]. Indeed,
the activities of the testbed have used the scanned CLASS materials as
the archive both to test extended access to a digital library and to
experiment with the production of computer output microfilm that
meets national preservation standards. Testbed projects include those
with the goals of:

• Improving Access: Technologies for improving local and national 
network access to the digital masters were defined and 
implemented. Software development centered on the creation of a 
client/server architecture to provide remote viewing of digital 
material from common computer platforms. 

• Evaluating Storage Technologies: Technologies for storing scanned 
digital masters were assessed, including different file formats and 
compression techniques and storage technologies. Different storage 
architectures, including document structures and indexing 
techniques, were developed and tested that enable the storage of 
extremely large files of information while providing a high degree 
of precision in recall. 

• Evaluating Scanning Technologies: In cooperation with the New 
York State "Big 11" research libraries, Cornell experimented with 
binary digital technology for image capture of a wide variety of 
archival material. Technicians also tested the CLASS system's
capabilities to reproduce a variety of illustrations found in books 
published from 1850-1917. A third project centered on evaluating 
the feasibility of creating digital computer output microfilm to meet 
national preservation standards. 



[1] A. Kenney and L. Personius, Cornell/Xerox/Commission on Preservation and
Access Joint Study in Digital Preservation Report: Phase I (January 1990-
December 1991). Washington, DC: Commission on Preservation and Access, 1992.

A. Improving Access

Phase I of the joint study resulted in a
prototype networked system for creating, storing, printing, and
accessing an electronic library. This system allowed for the distributed
scanning, printing, and storing of the digital library at a number of
locations served by a high speed network. During Phase II Cornell has
expanded access to the digital library both from Cornell locations and 
over the Internet. The development of a client/server environment 
permits images to be delivered over networks to a variety of hardware 
platforms, and facilitates viewing the images of library materials from 
local workstations. 

The components of this portion of the project include: 

• Providing a browsing capability from any workstation to descriptive
information about digital documents in the Testbed digital library,
and facilitating requests for the delivery of printed copies.

• Developing an image delivery server that translates images stored 
in the Testbed digital library to selected other storage formats and 
compressions for transmission to a variety of systems. 

• Delivering digital images to common workstations including Sun 
workstations, IBM PS/2 computers, and Apple Macintoshes, and 
evaluating the quality of interface software and of the onscreen 
display. 

The prototype digital library as developed by the Xerox Corporation 
consisted of three top level functional modules. 

Creation - Documents are added to the digital library at Cornell using a 
scanning workstation and sophisticated software. The actual 
scanner and the software are Xerox products. They have been 
described in earlier reports of this project, and detailed specifications 
are available from Xerox Corporation. Using this system, 
technicians scan books and then request that the books be filed into 
the digital library. 

Storage - The Xerox system stores scanned material as TIFF Group 4
image files in an image filing system. Cornell
has used a test system for this project, which now must be replaced.

Printing - A network connected DocuTech printer provides the ability 
to create high quality paper facsimiles at high speed from the digital 
library. Material can be sent to the printer from the creation 
workstation. 

Cornell perceived a need to add two capabilities to the digital library:
(1) a browsing capability, and (2) a mechanism to manage the use of the 
library. The initial features of the browsing capability include: 

— The ability for a library patron to read digital books from home or 
office using the same network-connected workstation that is used 
for other purposes. No new or specialized hardware should be 
required. 

— The ability to request a printed copy of all or of selected sections of 
one or more documents. 

— The ability to protect the digital library from either accidental or 
deliberate modification by public users. 

The Cornell digital library is based on the client/server model of
computation. Initially, Xerox staff worked closely and exclusively with 
Cornell University Library staff to define the requirements for library 
preservation and access. These requirements included storing both 
low- and high-resolution versions of images, so that the low- 
resolution images could be used for browsing over the network and the 
high-resolution images could be used for printing and storing. In 
addition, substantial work was done to define documents with internal 
structures that could be navigated. As part of the Testbed Project, 
Cornell developed complementary software to allow library users to 
browse the documents (including navigating the internal document 
structure) and request printed copies over the network. The Cornell- 
developed access software consists of an image delivery server and 
browsing clients for three common user workstations described below. 
It is freely distributed to institutions working on imaging projects. 



The Image Delivery Server 

The largest component of this project has been the development of an 
image delivery server connected to the campus network which allows 
images to be read from the image filing system and converted so that 
they can be sent out in revised formats. The image delivery server is a
UNIX application running on a Sun SPARCstation workstation.

A significant product of this project has been the definition and 
implementation of a protocol to be used between the image delivery 
server and the client software used for viewing digital files. 

Many of the functions of the original request server from Phase I have
been included in the image delivery server. At the completion of Phase
I, a prototype request server was in place [2]. The image delivery server
allows a researcher to make a print request using client software that is
now readily available for most workstations.

The digital library clients request information from the server over the 
network and present it to the users. The Image Delivery Server 
manages access to the digital library, which includes searching 
databases, parsing document structures, scaling and rotating images, 
and finally packaging and sending the desired image or information on 
to the client. The client/server model allows for unlimited 
expandability. Servers can be added as needed to support user clients 
and to communicate with other image delivery servers once 
appropriate protocols have been developed and implemented. 

The tasks for which the server is responsible include image scaling, 
rotation, decompression and compression. Processing requirements are 
determined by "handshaking" with the client. The Sun workstation
client is capable of decompressing an image, so the images can be sent
in compressed form for improved network performance. Clients
running on PCs and Macintoshes require more support from the
server. Images are stored in a compressed format, and must be 
uncompressed, scaled or rotated (as necessary), and possibly 
recompressed before being sent across the network. 
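
The division of labor described above can be illustrated with a minimal Python sketch. It is not the Cornell server code: the function name, the step names, and the single `can_decompress` capability flag are assumptions made for illustration, standing in for whatever information the handshake actually exchanges.

```python
def plan_server_steps(can_decompress, scale=False, rotate=False):
    """Return the ordered processing steps the server runs for one image.

    can_decompress -- hypothetical flag reported by the client during the
                      handshake (True for a client like the Sun workstation
                      client, False for a client that needs raw bitmaps).
    scale, rotate  -- whether the request needs the image transformed.
    """
    steps = []
    # Stored images are compressed, so any transform, or any client that
    # cannot decompress, forces the server to decompress first.
    if scale or rotate or not can_decompress:
        steps.append("decompress")
    if scale:
        steps.append("scale")
    if rotate:
        steps.append("rotate")
    # Recompress only if the image was decompressed and the client can
    # decompress it again on arrival.
    if "decompress" in steps and can_decompress:
        steps.append("recompress")
    steps.append("send")
    return steps


if __name__ == "__main__":
    print(plan_server_steps(can_decompress=True))
    print(plan_server_steps(can_decompress=False, scale=True, rotate=True))
```

A client that can decompress and asks for no transformation receives the stored file unchanged, which is the case in which network performance benefits most.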

Material for inclusion in the digital library is scanned at 600 dpi. The
current server software does not support scaling and rotation of images 
due to memory and processing constraints (the time to rotate and scale 
600 dpi images is on the order of minutes). To provide fast access, 100 
dpi "thumbnail" images are stored on local magnetic disks. The high 
resolution images are stored and used for printing facsimiles of 
documents and for scaling purposes. Thumbnail images are delivered 
to the clients for browsing and viewing in real time. 
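
The gap between the 600 dpi masters and the 100 dpi thumbnails can be made concrete with a little arithmetic. The sketch below is illustrative only: the 6 x 9 inch page size and the one-bit-per-pixel depth are assumptions, not figures from the report, and the sizes are for uncompressed bitmaps.

```python
def bitmap_size(width_in, height_in, dpi, bits_per_pixel=1):
    """Return (width_px, height_px, uncompressed_bytes) for a scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return width_px, height_px, (total_bits + 7) // 8  # round up to whole bytes


# Assumed page size: 6 x 9 inches (hypothetical, for illustration).
full = bitmap_size(6, 9, 600)    # 600 dpi master
thumb = bitmap_size(6, 9, 100)   # 100 dpi thumbnail

if __name__ == "__main__":
    print("600 dpi:", full)
    print("100 dpi:", thumb)
    print("pixel ratio:", (full[0] * full[1]) // (thumb[0] * thumb[1]))
```

Reducing the resolution by a factor of six in each dimension reduces the pixel count by a factor of 36, which is why the thumbnails can be delivered in real time while the masters cannot.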

The quality of the on-screen display for most of the material scanned
was acceptable, with the exception of halftone images. Because of the
use of screens, which provide a dot structure to duplicate the original
halftone, the scaling down of these images for screen viewing causes a
distortion of the images. To achieve a proper viewing resolution, the
halftone scanned images would have to be presented at full size, which
would be approximately six times the size of a 23" monitor. As is well
known, low resolution images with some grey scale provide the best
combination for on-screen viewing; binary formats, however, are best
for production printing purposes.

[2] At the completion of the Cornell/Xerox/CPA Joint Study, a prototype request
server was in place, designed to permit any workstation on the Cornell Campus
Network running X-windows software to request a printed copy of a digital image
or a set of images by means of a document structure file that provides users with a
description of the contents of a volume and the means for requesting specific page
sets. The specification for that server is included as Supplement III of The
Cornell/Xerox/Commission on Preservation and Access Joint Study in Digital
Preservation Report: Phase I (January 1990-December 1991).

The image delivery server starts a new copy (instance) of the server 
software for each client that connects. The memory used averages 
about 1 Mbyte per client. When a client opens a document, the image 
delivery server parses the document description files and initializes 
necessary structures. Each document opened requires about 1 Mbyte 
additional memory, depending on the size and complexity of the 
document. The maximum number of documents for each client is 
currently 3, limited by the hardware resources available. Each server 
requires roughly 3.5 Mbytes (with two documents open). If image 
rotation and scaling are required, additional memory is used.

The current demonstration system is a Sun SPARCstation 2 with 32
Mbytes of memory, running SunOS and OpenWindows. The Digital
Library search database, a Gupta SQLbase server which uses 5 Mbytes of
memory, also runs on this machine. After starting OpenWindows
and Gupta SQLbase, approximately 12 Mbytes are available. Under
these circumstances the current image delivery server hardware can
support only 3 simultaneous users. In the future the digital library will
need to be upgraded to support significantly more users.
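
The limit of 3 simultaneous users follows from the memory figures quoted above. The following Python sketch reproduces the estimate; the function names are hypothetical, and the simple linear model (about 1 Mbyte per server instance plus about 1 Mbyte per open document) is a simplification of the report's rounded figures.

```python
import math


def server_instance_mbytes(open_documents, base=1.0, per_document=1.0):
    """Approximate memory for one server instance, using the report's
    rounded figures: ~1 Mbyte per client plus ~1 Mbyte per open document."""
    return base + per_document * open_documents


def max_simultaneous_users(available_mbytes, per_instance_mbytes):
    """Whole number of server instances that fit in the available memory."""
    return math.floor(available_mbytes / per_instance_mbytes)


if __name__ == "__main__":
    # Figures from the report: ~12 Mbytes free, ~3.5 Mbytes per instance.
    print(max_simultaneous_users(12, 3.5))
    # Worst case under the linear model: 3 open documents per client.
    print(max_simultaneous_users(12, server_instance_mbytes(3)))
```

Both the reported 3.5 Mbytes per instance and the worst case of three open documents give the same ceiling on this hardware.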

Description of Viewing Clients

Clients access and browse the Cornell digital library, allowing users to 
search for documents by author, title, or catalog id, and then to traverse 
the document structure, e.g., by chapter or article. Pages may be viewed
from one book, or multiple documents may be opened for viewing 
simultaneously. 

Client capabilities included: 

Searching — After starting the client, a document window is visible 
which includes buttons for searching, printing, and navigating. The 
magnifying glass button or the Search menu item starts a search. 
The user is given a dialog box with a popup menu to limit the 
search to either author, title, or catalog id. All the books currently
available will be retrieved if the field is left blank or the Find All
button is selected in the popup menu.

Browsing — Once the search returns a document or list of documents
the user makes a selection. The client then retrieves the next level 
of the document structure which might include chapter headings or 
pages or other labels depending on how the document has been 
structured. By traversing successive levels, the user eventually 
comes to a screen where the first page of a selected section is 
displayed in the bottom portion of the document window. The user 
may then browse through the document. 



[Screen image: the first page of Chapter I, "A Glimpse into Infinity," of
Astronomy for All, shown in the Cornell Digital Library document window.]

FIGURE 2. ASTRONOMY FOR ALL PAGE DISPLAYED IN BROWSING WINDOW,
APPLE MACINTOSH DIGITAL LIBRARY CLIENT

Printing — A user can generate a print request either by clicking on the
printer icon in the control area or by choosing the Print menu item.
A dialog box will be displayed which informs the user of the
number of pages selected for printing. For instance, if the user has a
chapter heading selected and clicks on the print button, the client 
determines the number of pages to be printed in that chapter and 
informs the user of the total print request. The output is then sent 
to the Xerox DocuTech printer on the Cornell University campus 
for printing at 600 dpi resolution, and delivered to sites on campus 
via campus mail. 

[Screen image: print dialog. It states that the request will print to the
Heron DocuTech Printer at Cornell University and that 10 page(s) will be
printed and mailed only to a Cornell address, billed at 10 cents per page.
Entry fields: billing name, Cornell account number, mailing name, and
address (Cornell University, Ithaca, NY 14851). Buttons: Cancel and OK.]

FIGURE 3. PRINT DIALOG, APPLE MACINTOSH DIGITAL LIBRARY CLIENT

An example of printed pages from this book is included as Appendix 
IV. 
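
The page count that the client reports before printing can be pictured as a walk over the selected part of the document structure. The sketch below is a hedged illustration rather than the client's actual logic: the `Section` class and its fields are hypothetical, and the rate of 10 cents per page is the one shown in the print dialog of Figure 3.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One node of a document structure (hypothetical layout)."""
    label: str
    pages: int = 0                                  # pages directly in this node
    children: list = field(default_factory=list)    # nested Section objects


def count_pages(section):
    """Total pages in a section, including all nested subsections."""
    return section.pages + sum(count_pages(child) for child in section.children)


def print_charge_cents(section, cents_per_page=10):
    """Charge for printing the whole selected section."""
    return count_pages(section) * cents_per_page


# Illustrative selection: a chapter with one nested group of plates.
chapter = Section("Chapter I", pages=4, children=[Section("Plates", pages=6)])

if __name__ == "__main__":
    print(count_pages(chapter), "pages,", print_charge_cents(chapter), "cents")
```

Selecting a chapter heading therefore yields one total for the dialog to display, however deeply the chapter is subdivided.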



Delivery of Images to Workstations 

Client software developed for three workstations (IBM-compatible PCs
with VGA and running Windows; Macintoshes; and Sun
workstations) provides access to the Testbed digital library via the
image delivery server. The graphical interface varies from one
platform to the next, but the functionality is equivalent. Appendix V
contains samples from a digital library session using the UNIX client
running on a Sun SPARCstation.

The client software has been tested across the Internet and provides 
access at speeds comparable to what is provided at Cornell. The 
client/server architecture is now being tested by the Big 11 Research
Libraries in New York State, which use the various Cornell-developed
clients to access books in the digital library and to test the
print-on-demand capabilities.

The Macintosh client includes the ability to capture books as 
QuickTime movies. QuickTime is a new extension to the Apple 
Macintosh system that supports the incorporation of animation, sound 
and video into Macintosh files, and provides image compression 
capabilities. In the Macintosh digital library client, QuickTime is used 
for viewing, scaling, and compression of the images, and to save a 
document or portions of a document as a QuickTime movie for local
storage. 

Beyond the Testbed — Plans for Continuing Development

As Cornell worked on the design for the digital library, it
anticipated that additional management functions would be necessary to
provide a fully functioning digital library. In particular, the
digital library will need to:

— Record charges for printed copies of material and bill the
requester.

— Know if a particular book is covered by copyright, and if so, record 
royalties for the use of that work. The appropriate interface needs to 
be designed so that charges can be transferred appropriately for use 
of material. 

— For some library material, check that the reader is authorized to 
view documents. An early example of the need to authorize users 
was presented by a journal publisher who contracted to provide 
material, stored in the digital library, to some but not all categories 
of users. 

— Authenticate users. In order to charge readers or authorize them to 
comply with contractual arrangements, the system must be sure 
that readers are who they claim to be. 

— Track use in order to document how and when the digital library is 
used. This will assist librarians and scholars in assessing the impact 
of creating the digital library on the research process. 

The next phase of client development will include the addition of a 
WAIS interface to search text abstracts which will then link to the 
digital documents. User authentication will be implemented using 
Kerberos to verify users according to copyright licenses and affiliation 
with the Cornell community. Bookmarks will be developed so that a 
user may set a bookmark for fast access to particular locations in a 
document. 

To guide the thinking about new methods of access to Cornell material 
in the digital environment, two forums have been established. The 
Cornell University Library Priority Committee, a group composed of 
the Library's administrative heads and members of the Library 
Technology Department and chaired by Catherine Murray-Rust, 
Associate University Librarian, meets regularly to establish policy and 
determine how resources are to be allocated to technology-related 
work. The Priority Committee is actively reviewing the opportunities
and challenges of adding provisions for digital and electronic resources 
that are to become part of the collection of the Cornell Library. A white 
paper is expected later in 1993. 

A second University task force is evaluating digital technology as a way
to provide access to Cornell collections beyond the library. The 
members of the Task Force on Digital Access to Cornell Collections, 
chaired by Tom Hickerson, share an equal concern for the 
development of systems for the effective management and 
preservation of collections and for the design and implementation of 
systems which enhance and broaden access and use of these collections. 
In the expansion and application of digital technology, new capabilities 
for high-quality scanning and printing, high-density compression and 
storage, and high-speed networks offer remarkable new opportunities. 
The mission of this task force is to explore and capitalize on these 
opportunities. 

B. Evaluating Storage Technologies and Document Structure

Move from proprietary to UNIX file system



After evaluation and testing both at Cornell and at Xerox's internal 
facilities, Xerox decided not to use the prototype CLASS system 
developed and tested with Cornell. However, Xerox has plans for their 
product to evolve to meet the standards developed for this project. 
Cornell plans to implement it as soon as: 

• An export function is provided so that books can be brought out and
put in Cornell format.

• The scanner incorporates certain functions that are required for 
compatibility with the CLASS system. 

Cornell is using Xerox software and hardware for scanning, but does 
not intend to use the Xerox system for archival storage. The Xerox 
system was not intended for large libraries. The document structure
files that associate individual files for pages into logical documents of 
books, journals, or archival collections are not a widely acceptable 
standard and not sufficiently accessible to meet the needs of libraries. 
The Xerox Corporation supported Cornell's evaluation of hardware 
and software options for the digital library files. A prototype Xerox 
scanner with associated software has been used throughout the digital 
preservation study, both Phase I and Phase II. Cornell intends to 
continue using the prototype because it contains functions that have 
not yet been included in the product version. However, Cornell has 
been informed that the product will evolve to include these, and plans 
to convert to the product as soon as these features are available. 

A primary goal of the Testbed was to create and maintain a growing 
digital library of scanned documents that remains current and broadly 
accessible in the face of changing technologies, and compatible with de 
facto and de jure standards. Keeping this in mind, Cornell decided to 
use a standard UNIX file system. This provided great flexibility in the 
choice of actual hardware. Since many types of physical devices are 
supported by the UNIX operating system, optical storage can be used 
where appropriate, or magnetic storage as needed for performance. 
Cornell has supplied mass storage for the digital library. 

Cornell has scanned over 1000 books which require more than 30
Gbytes of storage. The 600 dpi images are stored on a magneto-optical
jukebox mass storage device, which has great capacity but relatively low
performance. To store the 100 dpi images for these books on local disks
for good performance will require 5 Gbytes of disk space.



Status of Digital Library Document Architecture 

Documents in the digital library are flexible and easy to use in a way 
analogous to a book. Images are linked together into documents 
reflecting the structure of the originals. Browsing is greatly enhanced by 
the use of this structure. Chapter headings, the table of contents, indices 
and the like can be easily located and used to navigate through the 
document. 

Cornell's intent in establishing a digital library has been to provide 
wide access and to facilitate experimenting with different means of 
access. This was accomplished by maintaining a simple and open 
document architecture to which other software could be easily adapted. 
A non-proprietary document architecture which incorporates the best 
parts of the various versions of CLASS has been defined by Cornell. 
Software was written which would access the digital library without 
using any proprietary software, resulting in access software which 
could be freely distributed. Access to documents in the digital library 
has been provided using standard network tools such as FTP and 
Gopher. 



Why a Document Architecture? 

Just as a conventional library contains books rather than pages, so the 
electronic library must contain documents rather than images. To 
organize groups of images into useful documents, a Document 
Architecture has been defined. This is a critical component of the 
electronic library, and Cornell recommends that it be further defined 
and standardized to facilitate sharing of electronic documents among 
institutions. 

During the scanning process, images are automatically linked into 
documents by creating document structure files that order the image 
files in the same way the binding of a book orders the pages. Thus, the 
digital book as currently configured consists of two parts: a set of 
individual pages stored as discrete bit map image files, and the 
document structure files which "bind" the image files into a document. 
In addition, a database entry is made for each digital document which 
permits searching by author and title (i.e., bibliographic information). 

Beyond the order of the pages, the arrangement of a physical book 
provides information to readers. The title page and publication 
information come first; the table of contents usually precedes the text; 
the text is divided into sections or chapters; if there is an index, it 
follows the text. The reader often refers to these components of a book 
when browsing the library shelves, in order to determine whether the 
book meets their needs. The document structure provides direct access 
to the components of an electronic document, storing the information 
that would otherwise be lost when the book is disbound for scanning. 

Preliminary experiments with network viewing of digital books have 
verified the early assumption that information about the structure of a 
book, beyond what is customarily included in a bibliographic record, is 
necessary for ease of use. For example, the page numbers printed in the 
original book must be incorporated into the document structure and 
correlated with the image files so that a request to retrieve a particular 
page number can be met with the image with that number printed on 
it. 
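
The correlation between printed page numbers and image files can be 
sketched as a small data structure. The sketch below is illustrative 
only: the class name, field names, and file names are hypothetical and 
are not taken from the CLASS software or from the document architecture 
described in Appendix II. 

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DocumentStructure:
    """Hypothetical stand-in for a document structure file."""
    title: str
    # Image files in binding order, one per scanned page.
    images: List[str] = field(default_factory=list)
    # Printed page label (e.g. "iii" or "12") -> position in `images`.
    page_index: Dict[str, int] = field(default_factory=dict)

    def add_page(self, image_file: str,
                 printed_label: Optional[str] = None) -> None:
        """Append an image; record its printed page number if it has one."""
        self.images.append(image_file)
        if printed_label is not None:
            self.page_index[printed_label] = len(self.images) - 1

    def image_for_page(self, printed_label: str) -> str:
        """Return the image file that carries the given printed number."""
        return self.images[self.page_index[printed_label]]


book = DocumentStructure(title="Example volume")
book.add_page("0001.TIF")          # title page, no printed number
book.add_page("0002.TIF", "ii")
book.add_page("0003.TIF", "iii")
book.add_page("0004.TIF", "1")     # first page of the main text
book.add_page("0005.TIF", "2")

# A request for printed page 1 resolves to the fourth image, not the first.
print(book.image_for_page("1"))
```

Keeping the printed label separate from the image sequence is what lets 
unnumbered plates and roman-numbered front matter coexist with the main 
pagination. 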

The creation and storage of the document structure is critical to the 
system design. Requirements for the document architecture follow. 

Cornell recommends a collaborative process involving other 
institutions and consortia to define further the application and utility 
of the document structure and to establish it in a standardized form 
across the digital libraries of multiple institutions. 



Document Architecture Requirements 

1. The architecture must be open. "Open" in this sense means that the 
specification is published and freely available, and may be used by 
anyone without paying any royalties. 

2. The architecture should be as simple as possible. The intent is to 
facilitate development of products using the architecture. 

3. The architecture should assume that data is stored in UNIX file 
systems. 

4. The architecture should not preclude use of the data in other 
standard ways, such as via FTP and Gopher servers. This means 
that the files containing the pages of a document must exist in a 
single directory, and the naming convention used must order them 
in the standard collating sequence (i.e., the series "0001.TIF, 
0002.TIF, ..., 0411.TIF" is acceptable, while the series "1.TIF, 2.TIF, ..., 
411.TIF" is not, as the latter would appear as "1.TIF, 10.TIF, 11.TIF, 
..."). 

5. The architecture should provide for storing the same information 
in different formats. For example, when a page of a document is 
available at several different resolutions or in ASCII as well as 
image form, these should be stored as separate files within the same 
document. In keeping with #4 above, each format may (and 
perhaps should) reside in a separate subdirectory, so that all image 
files are together, all ASCII files are together, etc. 

6. Low-resolution "thumbnail" images of each page must be stored to 
facilitate browsing and sharing of data. At present, the desired 
resolution for thumbnail images is 100 DPI. 

7. The architecture must support distribution of files so that similar 
files may be stored together, permitting optimization of storage use 
and performance. For example, it must be possible to specify that all 
100 DPI thumbnail images be in a certain directory. This also 
supports use of the data in other ways, as the directory name may be 
used to describe the format of the data. 

8. The architecture must support documents that are composed of 
references to all or part of other documents. For example, it should 
be possible to create an anthology by excerpting portions of other 
documents without making physical copies of the images, and it 
should be possible to build up a journal from separate articles. Such 
documents should require only additional document structure files 
and database entries. 

9. The architecture must support documents, components of which 
are stored on separate servers distributed across the network. 

10. The architecture must support not only a hierarchical structure for 
each document, but the ability to define multiple views of each 
document. Secondary views should be able to contain pages in 
different order from the primary view, and should be able to 
exclude selected pages. However, inclusion of additional pages 
would mean creating a new document. 

11. The architecture should accept, rather than dictate, directory 
structures in which documents will be stored. This will permit 
documents created in other ways to be added to the Digital Library 
simply by adding database information rather than by copying or 
moving files. 
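
The naming rule in requirement 4 can be illustrated with a short sketch. 
It assumes four-digit, zero-padded sequence numbers and a ".TIF" 
extension, as in the example given in that requirement; the helper name 
page_file_name is hypothetical. 

```python
def page_file_name(seq: int, width: int = 4, ext: str = "TIF") -> str:
    """Build a zero-padded file name such as 0001.TIF."""
    return f"{seq:0{width}d}.{ext}"


# The same 411 pages named with and without zero padding.
padded = [page_file_name(n) for n in range(1, 412)]
unpadded = [f"{n}.TIF" for n in range(1, 412)]

# A plain directory listing (FTP, Gopher, ls) sorts names as strings.
# Padded names keep their page order under that sort; unpadded names
# do not, because "10.TIF" sorts before "2.TIF".
print(sorted(padded) == padded)
print(sorted(unpadded) == unpadded)
print(sorted(["2.TIF", "10.TIF"]))
```

The width of four digits is only sufficient for documents of up to 9,999 
images; a longer document would need a wider field. 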

A description of the document architecture recommended by Cornell is 
included in Appendix II. 



C. Evaluating Scanning Technologies 

1. Establishment of Digital-to-Microfilm Feasibility 



One of the main objectives of the Testbed Project was to determine the 
feasibility of using digital image technology to produce computer 
output microfilm (COM) that could meet national technical 
specifications for preservation. These specifications cover a wide range 
of issues, including: the preparation of documents; composition of the 
film stock; quality of image capture as defined by reduction ratio, image 
placement, resolution, and density; film processing; and storage. 

In the fall and early winter of 1991/92, Cornell tested the conversion 
process for producing computer output microfilm from digital images. 
Although there are a number of companies in the United States 
offering digital to microfilm conversion, Cornell located only one, 
Image Graphics, Incorporated of Shelton, Connecticut, that was 
prepared to handle high-speed, high-resolution film recording. Since 
1974, Image Graphics has become a leading developer of electron beam 
technology for government and industry. Electron beam recorders 
(EBR) record directly from digital data to film, and, according to Image 
Graphics, the EBR provides ten times better resolution, speed, and 
dynamic range than conventional cathode ray tube, laser, and 
photomechanical imaging devices. 



Document Preparation 

Image Graphics agreed to run a test conversion of Cornell's digital 
images. The files for one volume were copied at Cornell onto magnetic 
tape along with AIIM Scanner Test Charts (as cited in ANSI/AIIM 
MS44-1988, "Recommended Practice for Quality Control of Image 
Scanners"). The volume, entitled The Steam Turbine: The Rede Lecture 
1911, by Sir Charles A. Parsons, contained halftones and line drawings 
embedded in text. The volume had been scanned at one page per digital 
image using an early version of the CLASS scanner software at 600 dpi 
resolution. 



Film Stock 

The magnetic tape was sent to Image Graphics where the digital images 
were recorded on a MICROGRAPHICS EBR SYSTEM 3000, an electron 
beam recorder used to produce microfilm from high resolution digital 
images and gray scale. The microfilm output from the digital files was 
produced on 35 mm, non-perforated Kodak Direct Electron Recording 
Film, SO-219. 

For a fuller account of the digital-to-microfilm analysis, see A. Kenney, "Digital 
to microfilm conversion: an interim preservation solution," Library Resources & 
Technical Services, Vol. 37, No. 4 (October 1993). 

The SO-219 film is a silver-gelatin microfilm designed expressly for use 
in recorders that expose film by means of an electronic beam brought to 
bear directly on the emulsion. The film emulsion layer is unusually 
thin and characterized by extremely fine grains and a relatively high 
silver to gel ratio; the support is ESTAR base, a clear 4 mil polyester 
film. Because of uncertainty regarding the longevity of the SO-219 film, 
Cornell requested that Image Graphics test a film that is in wide use in 
preservation microfilming. The company was able to produce a sample 
strip of several images using Kodak ImageLink HQ. 



Processing 

Image Graphics wet-processed the film in an Oscar Fischer processor, at 
a speed of five to seven feet/minute. The company used Kodak 
Ultratek developer at a concentration of ten to one. The film was 
rinsed with tap water filtered through a one micron filter, then double 
fixed, using Kodak Rapid Fix with Hardener, and rinsed in a final bath 
of filtered tap water. 

Cornell sent samples of the SO-219 and the ImageLink HQ films to 
Biels Microfilm Corporation of Buffalo, New York where a methylene 
blue test (as defined in ANSI PH 4.8-1985) was conducted to determine 
the amount of residual thiosulfate concentration. If fixing and washing 
are inadequate, thiosulfates, or silver salts, or both, will be retained by 
the film. These can break down, especially under poor storage 
conditions, to produce yellow staining in clear areas and fading in areas 
containing image silver. 

The test results indicated unsatisfactory levels of chemical residue 
remaining on the film. Concentrations of less than 1.4 mg indicate 
archival quality. The Kodak ImageLink HQ had an actual 
concentration of 2.4 mg; the Kodak Direct Electron Recording, SO-219, 
had an actual concentration of 2.5 mg. Both films failed the methylene 
blue test. 

An inspection of the SO-219 film over a lightbox revealed additional 
problems associated with the processing of the film. The film was 
dirty, dusty, and slightly scratched, and there were frequent splotches 
along the edges, indicating that the film had not been properly washed 
and dried. While the results reveal improper processing of the film on 
the part of Image Graphics, they do not have a direct bearing on the 
process of digital-to-microfilm conversion. Once the film is created, the 
method of processing is identical to that used with conventional film, 
and, with proper handling, the film should meet standard preservation 
specifications. 



Density, Image Placement, and Reduction Ratio 

Image Graphics produced both a negative and a positive version of the 
film. Density readings taken on the negative averaged 1.21, which falls 
within the national guidelines for background density ranges for 
master negatives. However, it should be noted that, unlike 
conventional micrographics, film density for digital COM is a software- 
controlled variable within the EBR System 3000 recorder. Image 
Graphics could have adjusted the density to any level. 

Reduction ratio and image placement are also controlled by system 
software. Prior to recording the digital images on film, Image Graphics 
had rotated the images for film placement of two images per frame in the 
cine position at 10X reduction ratio, based on Cornell's specifications. 
While this conversion was successful, the light box inspection of the 
positive film revealed that the spacing between frames was too wide, 
which was attributed to incorrect software calculations for image 
placement. Further, the white background against which the images 
were placed made it difficult to discern the edges of individual pages, 
and this too would need to be addressed by software programming. 



Photographic Resolution and Quality Index (QI) 

A standard microfilm version, including three generations of film, of 
The Rede Lecture was prepared by Cornell Photo Services for 
comparison purposes. The positive copies produced by the two 
technologies were reviewed on identical side-by-side microfilm readers 
and compared to the original volume. Staff members who performed a 
frame-by-frame comparison could discern no difference in the capture 
of text and line art between the two films. In both cases the images 
appeared crisp, with sharp contrast between text and background. 

Staff members also examined the IEEE Scanner Test Chart under 100 X 
magnification. The microscopic inspection of the technical targets 
reproduced there revealed that the resolution readings taken on both 
the positive and negative versions of the digital COM were identical, 
indicating no generational loss between copies. The same resolution 
readings were also achieved when the image of the test target was 
displayed at 600 dpi on the computer monitor, indicating negligible or 
no generational loss between the digital file and the COM. In contrast, 
there was a drop of two readings from the archival master to the 
service copy in the conventional film. 

Quality Index (QI) is a means for relating resolution and text legibility, 
based on the measurement in millimeters of the height of the smallest 
significant character, usually the lower case letter "e," as read from the 
original document. This measurement (h) is then multiplied by the 
smallest resolution pattern that is resolved on the film (p), and the 
resultant number is used as the Quality Index (QI), e.g., p X h = QI, or 
6.3 X .8 = 5.04. Quality Index is described in detail in ANSI/AIIM 
MS23-1991. The standard states that a QI of 5.0 is considered medium because 
all alphanumeric characters are readable without difficulty, but serifs 
and fine detail may be lost. A QI of 5 has also been determined to be 
acceptable quality for applying Optical Character Recognition (OCR) 
software. A QI of 8.0 is considered excellent because serifs and fine 
detail are resolved. While the ANSI/AIIM standard indicates that a QI 
of 5 is acceptable, the Research Libraries Group, Inc. requires a QI of 8 
for its preservation microfilming projects. 
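
The Quality Index arithmetic above can be checked with a few lines of 
code. This is a minimal sketch: the function names are hypothetical, and 
the rating labels use only the two thresholds quoted in the text (5.0 
and 8.0); the label for values under 5.0 is a placeholder, not a term 
from the standard. 

```python
def quality_index(p: float, h_mm: float) -> float:
    """QI = p X h: smallest resolved pattern times height of smallest 'e' (mm)."""
    return p * h_mm


def rating(qi: float) -> str:
    """Classify a QI using the thresholds quoted in the text."""
    if qi >= 8.0:
        return "excellent"
    if qi >= 5.0:
        return "medium"
    return "below medium"  # placeholder label, not from the standard


# Worked example from the text: p = 6.3, h = .8 mm.
qi = quality_index(6.3, 0.8)
print(f"QI = {qi:.2f} ({rating(qi)})")

# The same pattern reading with a 1.3 mm "e".
qi = quality_index(6.3, 1.3)
print(f"QI = {qi:.2f} ({rating(qi)})")
```

Because QI scales linearly with character height, the same film 
resolution can rate as medium for small type and excellent for larger 
type. 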

The resolution readings on the digital COM measured 6.3 line pairs per 
millimeter (lpm) for the equivalent of the third generation service 
copy, indicating a Quality Index rating of High Quality (8.0) for the 
master negative for material where the smallest "e" is 1.3 mm (for 6.3 
lpm), and a Quality Index Rating of Medium Quality (5.0) for material 
where the smallest "e" is .8 mm (for 6.3 lpm). Conventional 
microfilming is superior at capturing resolution: readings on the 
archival master were 10 lpm, indicating a Quality Index rating of High 
Quality for material where the smallest "e" is 1 mm high. To achieve 
the resolving power of standard micrographics would require scanning 
at about 1,000 dpi. 

Digital Resolution vs. Photographic Resolution 

Although the resolution reading on the digital COM measured 6.3 lpm, 
caution should be used in equating digital resolution to photographic 
resolution. In fact, the ANSI/AIIM MS44-1988 standard, 
"Recommended Practice for Quality Control of Image Scanners," 
advises against the use of resolution test patterns for scanners with a 
resolution less than or equal to 600 dpi because of "(1) the problems 
associated with the random placement of samples, and (2) the 
conflicting requirements placed on the threshold." 

The February 1992 draft AIIM technical report on photographic and 
electronic imaging resolution suggests an alternative method for 
determining Quality Index for digital imaging that takes into 
consideration the probability of misregistration. The proposed Quality 
Index equation for digital resolution (Rd) of a scanner is: 

Rd=[(2 X QI)/(.039 X h)] X 1.5 

In this equation, the traditional QI, which assumes resolution units of 
line pairs per millimeter, was doubled because digital resolution 
measures just dots or lines. The height (h) of the smallest "e" is defined 
in inches in the tutorial. To convert millimeters to inches, the figure h 
is multiplied by .039. To compensate for possible misregistration in the 
scanner, the total figure is increased by 50%, or multiplied by 1.5. 

To determine the height of the smallest "e" that can be resolved 
digitally, the formula becomes: 

h=[((2 X QI)/Rd)/.039] X 1.5 

The digital resolution (Rd) of the CLASS scanner used equals 600, and 
if the QI were to equal 8, the height of the smallest "e" that can be 
resolved would be: 



h=[((2 X 8)/600)/.039] X 1.5 



which equals 1 mm. 

If a QI of 5 (medium) were acceptable, the height of the smallest "e" 
that can be resolved would equal: 

h=[((2 X 5) / 600) / .039] X 1.5 



or .6 mm. 
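
The two results above can be reproduced directly from the formula. The 
sketch below is only a restatement of h=[((2 X QI)/Rd)/.039] X 1.5; the 
function and constant names are hypothetical. 

```python
MM_TO_INCH = 0.039       # conversion factor used in the text
MISREGISTRATION = 1.5    # 50% allowance for scanner misregistration


def smallest_height_mm(qi: float, rd_dpi: float) -> float:
    """Height in mm of the smallest 'e' resolved at a given QI and dpi."""
    return ((2 * qi) / rd_dpi) / MM_TO_INCH * MISREGISTRATION


for qi in (8, 5):
    h = smallest_height_mm(qi, 600)
    print(f"QI {qi} at 600 dpi: h = {h:.2f} mm")
```

The unrounded values are slightly above the 1 mm and .6 mm figures given 
in the text, which are rounded to one decimal place. 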



Practical Application for 600 DPI Scanning 

Given the CLASS scanning system's technical capability to capture text, 
one must then determine its practical application in a preservation 
context. The question becomes, is 600 dpi scanning adequate to capture 
the range of printing found in brittle books? One way to answer this is 
to translate Digital Quality Index requirements into printing type size 
for material published since the mid-nineteenth century. 

One hundred and five measurements of the height of the smallest "e" 
at 9, 10, 11, 12, and 14 points were taken for a variety of typefaces used 
in the nineteenth and early twentieth centuries. These point sizes 
were commonly used for printing the body of texts. Typical x-heights 
in the 9 to 14 point sizes ranged from 1.34 mm to 2.15 mm, which are 
easily rendered by 600 dpi scanning. The x-height of the main text in 
The Rede Lecture 1911, for example, measured 2.1 mm, which would 
require a scanning resolution of 300 dpi to achieve a Digital QI of 8. 
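
The scanning resolution needed for a given type size follows from 
rearranging the digital Quality Index formula, 
h=[((2 X QI)/Rd)/.039] X 1.5, to solve for Rd. The sketch below applies 
that rearrangement to the 2.1 mm x-height quoted above; the function 
name is hypothetical. 

```python
MM_TO_INCH = 0.039       # conversion factor used in the text
MISREGISTRATION = 1.5    # 50% allowance for scanner misregistration


def required_dpi(h_mm: float, qi: float) -> float:
    """Scanner resolution needed to reach a Digital QI for an 'e' of h mm."""
    return (2 * qi * MISREGISTRATION) / (MM_TO_INCH * h_mm)


# Main text of The Rede Lecture: x-height of 2.1 mm, target Digital QI of 8.
print(f"{required_dpi(2.1, 8):.0f} dpi")
```

The computed figure falls just under 300 dpi, which agrees with the 
rounded resolution stated in the text. 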

Some charts, formulae, diagrams, illustrations, advertisements, and 
footnotes are printed in smaller types, and to be captured successfully 
will require higher scanning resolutions. Modern phototypesetting, 
introduced in the mid-nineteen sixties, offers a range of standard sizes 
from 4 or 5 points to 72 points. By contrast, items printed in the 
nineteenth century and first half of the twentieth century — the period 
during which most brittle books were published — were set by hand or 
by casting (monotype or Linotype), using metal type. With metal type, 
the ink has a tendency to spread, making the edges of the letterforms 
uneven, and so there was a limit to how small a typeface could be 
effectively printed or letters spaced. Metal type was commonly 
produced in sizes ranging from approximately 5 point to 72 point, with 
6 point being the smallest point size for most typefaces. 

In a sample of 26 typefaces used from the mid-nineteenth century on, 
the x-height for 6 point type ranged from .9 mm to 1.4 mm, with the 
average measuring 1.17 mm. Ten examples of 5 point text were available, 
and the x-height measured from .9 mm to 1.1 mm, with the average 
being .98 mm. In The Rede Lecture 1911, the x-height measured 1.5 mm for 
captions and .9 mm for text used in diagrams and charts. Based on 
these figures and the Digital Quality Index measurements noted earlier, 
600 dpi scanning should render the complete range of metal type in 
common usage from the nineteenth century to the mid twentieth 
century. 



Visual Inspection 

While the Digital Quality Index provides a useful means for translating 
between digital and classical resolution, the authors of the tutorial on 
photographic and electronic imaging resolution recommended 
strongly that users verify the quality of image capture by visually 
examining the scanner's output. Samples of the 5 and 6 point type were 
scanned, printed on paper, and examined both with the naked eye and 
under an eye loupe. The quality of the reproductions was uniformly 
high. All were easily readable, with serifs and fine details rendered. 

Some modern technical literature is printed at 4 point type, and some 
charts and diagrams used in books from the past century and a half 
contain handwritten characters that are below 5 point type in size. 
Measurements of 8 typefaces at 4 point ranged from .6 mm to .9 mm, 
with an average of .72 mm. The height of the smallest handwritten 
character in a line drawing from The Rede Lecture measured .55 mm, 
and was legibly produced. According to the Digital Quality Index rating 
for 600 dpi scanning, 4 point text will be rendered with medium quality 
(QI=5). An examination of a number of scanned samples of 4 point type 
under magnification revealed that while all alphanumeric characters 
were clearly readable and distinct, some type with serifs or in italic was 
more difficult to capture than sans serif or Roman type. 

Based on these experiments, it appears that the resolution power of 600 
dpi binary scanning will produce film that meets RLG's specifications 
for excellence (QI=8) for most text printed at 5 point type and larger, and 
it can be used to produce microfilm that meets the ANSI/AIIM 
MS23-1991 standard for acceptable quality (QI=5) for text printed at 4 point 
type and larger. 

Given the CLASS scanning system's technical capabilities and the 
widespread use of typefaces 5 point and above in the printing of books 
between 1800 and 1960, Cornell has concluded that 600 dpi scanning 
and the production of digital COM can serve as a viable means for 
capturing and preserving brittle books. Obviously, higher resolution 
scanning would offer some improvement, and 900 dpi production 
scanners may be practical before long. However, a guiding principle of 
the Cornell project was that the use of digital technology must result in 
products of sufficiently high quality and must be cost effective to be 
considered viable for the preservation of deteriorating library 
materials. Higher resolution scanners are currently available, but 
because the scanning time is very slow, the cost of image capture would 
be prohibitive. 



Costs and Availability of Digital COM Service Bureaus 

There is great potential in using digital technology to capture brittle 
material and to produce microfilm as a preservation backup. However, 
there are many issues that remain to be resolved before this becomes a 
practical alternative. The first issue concerns costs and the availability 
of service bureaus to record the computer output microfilm. To date, 
Cornell knows of only one company, Image Graphics, that is prepared 
to offer this service. According to the Marketing Manager for the 
company, Image Graphics' primary interest lies in marketing the 
Micrographics EBR System 3000. However, the company realizes that 
many institutions cannot afford to purchase their system and the 
company is thus offering a scanning service. As has been reported, 
Cornell experienced some concerns about the quality of the processing 
of the film and the use of the SO-219 film base. While both concerns 
could be adequately addressed by Image Graphics, or by having the 
processing subcontracted to another service bureau, there would be a 
period of adjustment and some costs associated with meeting 
ANSI/AIIM standards. 4 



4 In a letter of December 5, 1991, Putnam Morgan, Marketing Manager for Image 
Graphics, quoted a price per book for the production of digital COM of $50 for the 
Kodak SO-219 and $25 for Kodak ImageLink HQ, with the price differential 
attributed to the cost of the film stock. Image Graphics would also have charged 
a one time set-up fee of $5,000 for "non-recurring engineering and coordination," 
and would have preferred to convert to the use of ImageLink HQ slowly over time. 
These charges were based on an average of 300 pages/book and 2,400 pages per 
roll of film for film lengths of 120 to 125'. A discount price was also given for a 
high volume of work (1,000 to 2,000 volumes/month was $20; 2,000-3,000 
volumes/month was $18). The per frame charge, exclusive of the one time fee, 
would be close to $.17/frame or $.083 per page. 

This cost may seem high, but it should be remembered that it reflects start up 
costs for a new service, which should come down over time. Moreover, the cost 
may compare favorably to the cost of converting film to digital images. 
Estimates for this conversion run from several cents/frame to $1.50 per page for 
the creation of digital images from 35 mm film. The range in price may reflect 
additional costs associated with indexing the digital images. In addition, much 
of the work in film-to-digital conversion is being done for banking and financial 
applications and is limited to 16 mm microfilm. A recent contract for converting 
35 mm film to 500 dpi digital images was $.35/page. 



Conclusion 

The results of this study show that preservation quality microfilm can 
be produced from 600 dpi scanning, and Cornell recommends that this 
option (including costs, vendor relations, and large scale conversion) 
continue to be explored. Cornell had initially considered converting a 
large number of digital books to microfilm in 1992 but decided against 
this for several reasons, the primary one being that there was a 
considerable delay in the development of the document structure 
software that will be required for merging microfilm targets with 
digital page images and for reel programming. The document editing 
software has recently become available from Xerox Corporation. 
Additional software programming would be required to extend it to 
cover the merging of digital images with the requisite targets and to 
create microfilm reel contents. In June 1993, Cornell submitted a 
proposal to the National Endowment for the Humanities to conduct an 
end-to-end demonstration project to create digital COM for 1500 
volumes. Support is also sought to assemble a technical advisory 
committee to address issues of quality, performance, and the 
development of draft guidelines for use by both research libraries and 
service bureaus. 

Definitions of quality must be developed for the use of digital 
technology as a preservation alternative. First, micrographic standards 
are not totally transferable to digitally-produced microfilm, as the 
problems of using resolution test patterns attest. Second, because film 
density is a software-controlled variable in the production of digital 
COM, it may have less bearing on the image quality than it does in 
conventional microfilming. Third, micrographics standards are based on 
defining quality for the preservation copy (film) and not the use copy 
(paper). This is problematic given the degradation caused by converting 
from film to paper via a reader/printer. Fourth, unlike micrographics, 
quality standards for digitally captured material could vary according to 
the type of document being scanned (gray scale vs. binary) and the 
medium of the use copy (paper, film, on screen image). Quality 
standards for digital COM would assist in the knowledgeable use of 
digital technology as a means to preserve and make available 
deteriorating library materials. 

2. Image Capture for Illustrated Texts and Archival Material 

The Cornell/Xerox CLASS System was designed primarily to capture 
printed text, and as the Final Report for Phase I of Cornell's project 
indicates, the system can be used to produce a paper facsimile of 
comparable quality and lower cost than photocopy. From February 
through August 1992, scanning technicians experimented with the 
CLASS system's capability to capture a wide variety of hand printed or 
machine-produced illustrations, including line drawings, etchings, 
engravings, halftones, and continuous tone photographs that were 
present in fifty illustrated volumes published between 1850 and 1917. In 
September, the technicians began a six-month project to test the 
system's capability to reproduce a wide array of archival material. This 
latter project was conducted on behalf of the eleven comprehensive 
research libraries in New York State and partially funded by a 
Cooperative Preservation Grant through the New York State Program 
for the Conservation and Preservation of Library Research Materials. 5 

Because no guidelines existed to assist the technicians in determining 
how best to treat illustrations or archival material, much of their time 
was spent experimenting with various combinations of scanner 
settings, including filters, screens, and tonal reproduction curves. The 
challenge posed by this material was further compounded by changes 
over time in printing processes and in the media used to create 
manuscripts (from quill pen to laser-printed documents). With older 
published material, for example, scanner settings would have to be 
adjusted several times through the course of scanning a book because 
of the variations in the ink applications on the printer's plates. 

Photomechanical processes for illustrations included letterpress, 
halftone, photogravure, and collotype etchings or engravings that 
utilized the cross hatching of straight lines to produce the illusion of 
tone. The process of creating halftones has changed over time. Early 
halftones from the mid-to-late nineteenth century varied from 
twentieth century halftones in the resolution of the printing screens 
used to determine the frequency of the dots. Early halftones employed 
a more widely spaced dot pattern, resulting in a coarser image. 

5 The findings and representative samples of this second project of the Testbed are 
presented in the final report Preserving Archival Material Through Digital 
Technology. 

These changes required the technicians to apply different scanning 
settings for similar image types. The technicians also grappled with the 
question of fidelity: should their scanned representation be faithful to 
the process used to create the original or should they strive to capture 
the essence of the image? In the end, the answer was to try to capture 
the image, even if it meant representing a fine line etching as a 
halftone (see examples K & L). 

Binary scanning can reproduce many categories of printed illustrations 
and archival material as paper facsimiles that are in most cases 
superior or equal to the quality of photocopy versions. This was true 
for most machine-produced documents, including typescript, offset 
printing, and letterpress printing. Digital technology effectively 
captured most handwritten documents (including a broad range of ink 
and paper colors) and line art, including woodcuts, line drawings, 
graphs, and other simple edge-based representations. In some cases, 
the scanned version actually improved the legibility of the original 
document, although a 600 dpi resolution is insufficient to capture all 
the detail present in some fine line etchings and engravings. Binary 
scanning also proved superior to photocopying for reproducing the 
depth of tonal range present in continuous tone and halftone images. 

The major distinction between binary scanning and photocopying for 
capturing photographs centers on a tradeoff between resolution and 
tonal reproduction. Photocopying can achieve extremely high 
resolution and can capture fine lines but it sacrifices detail in the 
highlights and the full range of shading. 

Perhaps digital image technology's greatest advantage over light lens 
processes is in capturing text cum image, where the illustration can be 
windowed and treated separately from the printed text in a manner 
that optimizes the capture of both. A description of the CLASS 
scanning system is located in Appendix I. 

The following chart and examples depict the capability of binary 
scanning in general, and the CLASS system in particular, to capture a 
wide array of illustrations commonly found in books published 
between 1850 and 1917. 6 



6 A similar description for archival material is included in Preserving Archival 
Material. 

XEROX WG-40 SCANNER SETTINGS USED FOR ILLUSTRATIONS FOUND IN BOOKS 
PUBLISHED FROM 1850 TO 1917 

ILLUSTRATION TYPE: LINEART/TEXT (Line Drawings, Text, Charts/Diagrams) 
SCANNER SETTINGS: Lineart with enhancement filter and threshold at 
50-120, with the majority falling in the 85-110 range. Threshold was 
increased for faint text, while thresholds as low as 45-50 were used to 
capture material on darkened or colored paper. 
COMMENTS/EVALUATIONS: These are generally the easiest categories of 
printed material to capture. Because of the simplicity of the images, 
very little manipulation of the scanner settings was required and the 
settings were generally consistent throughout the entire book. Image 
editing can be used to eliminate stains and to reconstruct damaged or 
missing text. (Compare Examples C and D.) 

ILLUSTRATION TYPE: LINEART/TEXT (Non-Latin Text) 
SCANNER SETTINGS: Lineart with enhancement filter and threshold at 
50-120. 
COMMENTS/EVALUATIONS: The factors that made languages such as Hebrew, 
Chinese and Sanskrit more difficult to capture were the varying line 
widths contained within each character, poor quality printing, and the 
darkness of the pages caused by their brittleness. The use of 
enhancement filters improved the capture of non-Latin text. 
(See Example A.) 

ILLUSTRATION TYPE: LINEART/TEXT (Mathematical Formulas) 
SCANNER SETTINGS: Lineart with enhancement filter and threshold at 
50-120. 
COMMENTS/EVALUATIONS: The relatively wide characters found next to the 
tiny superscripts found in mathematical formulas posed more of a 
challenge than regular text. In addition, many of the math books were 
handwritten and the characters found in the formulas as well as the 
text varied widely in line width and density. The choice of threshold 
and filter affect one another, particularly for capturing a combination 
of fine line and coarse line material. (See Example B.) 

ILLUSTRATION TYPE: MANUAL PRINTS (Relief: Wood Engraving, Line Block; 
Intaglio: Steel Engraving) 
SCANNER SETTINGS: Lineart with enhancement filter and threshold at 
50-100. 
COMMENTS/EVALUATIONS: Generally for all of the Manual Prints that we 
encountered, if the engraving did not include very fine detail, using 
Lineart with an enhancement filter was sufficient to capture the image. 
However, in cases where the details of the image were created by very 
fine lines closely spaced, we used the Halftone or Photo Mode to 
attempt to capture the detail. This resulted in capturing the detail 
but losing the "feel" of the original. (See Examples E, F, and G.) 

ILLUSTRATION TYPE: PROCESS PRINTS (Relief: Halftone) 
SCANNER SETTINGS: High Frequency with descreening filter, screen, and 
TRC map. 
COMMENTS/EVALUATIONS: Halftones are created using a variety of screens, 
from the 80 line screens used in newspapers to the 150 line screens 
used in fine art books. This variation necessitates the use of 
different descreening filters and TRC maps to capture the contrast and 
details of the original. (See Example H.) 

ILLUSTRATION TYPE: PROCESS PRINTS (Intaglio: Mezzo Type Photogravure, 
Machine Printed Photogravure) 
SCANNER SETTINGS: Photo with screen and TRC map. 
COMMENTS/EVALUATIONS: Photo Mode captures the details and "feel" of the 
original adequately. The system is able to capture process prints 
better than it can capture an original photograph because there are 
fewer tonal gradations in the photogravures. (See Examples I and J.) 

ILLUSTRATION TYPE: PROCESS PRINTS (Planographic: Photolithograph, 
Collotype) 
SCANNER SETTINGS: Low Frequency using an enhancement filter and Moire 
Away with a threshold at 40-120. 
COMMENTS/EVALUATIONS: The originals may vary a great deal in their 
density and detail, which requires adjustment of the threshold and 
Moire Away to find the right balance for capturing the variations in 
the original. (Examples K and L show two ways of capturing a 
photolithograph.) (Example M represents a reproduction of a collotype.) 

ILLUSTRATION TYPE: Photograph 
SCANNER SETTINGS: Photo with screen and TRC map. 
COMMENTS/EVALUATIONS: Photo Mode allows for the capture of an original 
photograph with most of the detail found in the original. The 
continuous tone image is changed into an image composed of a series of 
dots which can only be black or white in binary scanning. 
(See Example N.) (Examples O and P represent a CAMI scan and the final 
image selected.) 
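
The chart above can be read as a lookup from illustration type to a starting scanner configuration. The following Python sketch records it that way. The image types, filters, and threshold ranges are copied from the chart; the names ScannerSetting, STARTING_SETTINGS, and starting_setting are hypothetical and are not part of the CLASS software, and the single "manual print" key stands in for all three manual print types.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ScannerSetting:
    image_type: str                       # CLASS image type named in the chart
    aids: Tuple[str, ...]                 # filters, screens, and maps used with it
    threshold: Optional[Tuple[int, int]]  # (low, high) range, where the chart gives one


# Settings shared by several rows of the chart.
LINEART = ScannerSetting("Lineart", ("enhancement filter",), (50, 120))
PHOTO = ScannerSetting("Photo", ("screen", "TRC map"), None)
LOW_FREQ = ScannerSetting(
    "Low Frequency", ("enhancement filter", "Moire Away"), (40, 120)
)

# Keys are hypothetical labels chosen for this sketch.
STARTING_SETTINGS = {
    "line drawing": LINEART,
    "text": LINEART,
    "chart/diagram": LINEART,
    "non-latin text": LINEART,
    "mathematical formula": LINEART,
    "manual print": ScannerSetting("Lineart", ("enhancement filter",), (50, 100)),
    "halftone": ScannerSetting(
        "High Frequency", ("descreening filter", "screen", "TRC map"), None
    ),
    "photogravure": PHOTO,
    "photolithograph": LOW_FREQ,
    "collotype": LOW_FREQ,
    "photograph": PHOTO,
}


def starting_setting(illustration_type: str) -> ScannerSetting:
    """Return the chart's starting configuration for an illustration type."""
    return STARTING_SETTINGS[illustration_type.strip().lower()]
```

As the comments column makes clear, these are starting points only; the technicians adjusted threshold and filter page by page where the originals varied.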




EXAMPLE A Hebrew Text 
From a book published in 1900. 
Scanned in Window Mode as a Text/Lineart Image Type. 

EXAMPLE B Math Text 
From a book published in 1900. 
Scanned in Window Mode as a Text/Lineart Image Type. 

EXAMPLE C Line Drawing 
From a book published in 1896. 
Scanned in Window Mode as a Text/Lineart Image Type. 

EXAMPLE D Image Wizard 
Modification of EXAMPLE C using Image Wizard. 
Scanned in Window Mode. 
Placed a window around the drawing to decrease the threshold from that used for the text. 
In Image Wizard bitmap editing package: reconstructed words using existing text letters, and 
"erased" the black areas created by scotch tape used to repair the original page. 

EXAMPLE E Manual Print - Relief - Wood Engraving 
From a book published in 1873. 
Scanned in Window Mode as a Text/Lineart Image Type. 

EXAMPLE F Manual Print - Relief - Line Block 
From a book published in 1891. 
Scanned in Window Mode as a Text/Lineart Image Type. 

EXAMPLE G Manual Print - Intaglio - Steel Engraving 
From a book published in 1879. 
Scanned in Window Mode as a Text/Lineart Image Type. 

EXAMPLE H Process Print - Relief - Halftone 
From a book published in 1914. 
Scanned in Window Mode as a Halftone Image Type with Text/Lineart Image Type used for 
the accompanying text. 

EXAMPLE I Process Print - Intaglio - Mezzo type 
From a book published in 1891. 
Scanned in Window Mode as a Photo Image Type with Text/Lineart Image Type used for 
the accompanying text. 

EXAMPLE J Process Print - Intaglio - Machine Printed Gravure 
From a book published in Japan in 1937. 
Scanned in Window Mode as a Photo Image Type with Text/Lineart Image Type used for 
the accompanying text. 

EXAMPLE K Process Print - Planographic - Photolithograph 
From a book published in 1875. 
Scanned in Window Mode as a Low Frequency Image Type. 

EXAMPLE L Process Print - Planographic - Photolithograph 
From a book published in 1875. 
Scanned in Window Mode as a Photo Image Type with Text/Lineart Image Type used for 
the accompanying text. 

EXAMPLE M Process Print - Planographic - Collotype 
From a book published in 1868. 
Scanned in Window Mode as a Low Frequency Image Type. 

EXAMPLE N Process Print - Photograph - Attached 
From a book published in 1905. 
Original photograph quite faded. 
Scanned in Window Mode as a Photo Image Type with Text/Lineart Image Type used for 
the accompanying text. 

EXAMPLE O 
Scanned in CAMI Mode - Image Number 10 was chosen to be reproduced. 

EXAMPLE P Process Print - Photograph - Attached 
From a book published in 1876. 
Scanned in Window Mode as a CAMI-Photo Image Type with 
Text/Lineart Image Type used for the accompanying text. 







III. Expanding the Digital Library 



Originally the digital library was intended only as a way to preserve 
library materials already owned by Cornell University Library. The 
testbed project has served as the impetus for the digital library to begin 
an expansion that will include current and recent scientific journals, 
dissertations, research papers, and current textbooks. Major 
collaborative projects have been proposed to add scholarly material 
that will be used nationwide. Cornell has worked to provide an 
infrastructure for the testbed that is sufficiently flexible and expandable 
to enable the digital library to meet the immediate needs for storage of 
library material, and to support the additional services that are 
anticipated over the next three years. 

Several initiatives are underway that are testing the scaling of the 
digital library: 

• The University Licensing Project (TULIP) is a cooperative research 
project in which Elsevier Science Publishers and eleven 
universities, each with strength in the physical and engineering 
sciences, are testing systems for networked delivery and use of 
journals. Cornell University's objective in this project is to 
evaluate its capacity (a) to make scholarly articles in on-line form 
available and easily accessible over the campus network to Cornell 
faculty, students and staff, and (b) to exchange such publications 
over the national network with other institutions. Cornell is 
loading the electronic files for 30 materials science journals into its 
digital library, and making this information accessible to the Cornell 
community over the campus network. Over the course of three 
years, 100,000 pages/year will be scanned and added to the Cornell 
Digital Library. 

• Further growth of the digital library will occur as more publishers 
begin supplying material in electronic form rather than on paper. 
Already, some journals of interest to the research community are 
being published primarily, or only, in electronic form. For material 
with time value (such as scientific journals) electronic publishing 
offers obvious advantages. 

• The preservation scanning currently being done in-house will 
increasingly shift to service bureaus, following the model of 
preservation microfilming. This will increase vastly the rate of 
incremental growth of the digital library. It is estimated that 
Cornell owns over 1 million volumes in need of preservation 
through reformatting. 

• The level and complexity of indexing will increase. Currently, 
searching of the digital library is provided through an SQL database 
running on the digital library server. This database provides 
limited searching capabilities by title, author, and document ID, and 
is intended primarily for known-item searching (that is, locating a 
document that is known to be in the digital library). The database 
processing needs will grow exponentially as the digital library 
grows. Projects such as TULIP are based on a many-to-one database 
so that journal articles can be accessed individually. This will mean 
an order of magnitude increase in database entries per document. 

• As the digital library becomes a tool used for actual research, the 
number of users and their level of sophistication will increase. Not 
only will there be more concurrent users, but each user is likely to 
open more documents simultaneously. 

• A variety of storage formats will be accommodated. Although the 
document architecture provides for different storage formats, at 
present the digital library software supports only TIFF bitmap 
images stored using the International Consultative Committee on 
Telegraph and Telephone (CCITT) Group 3 or Group 4 
compression. Other formats are being used to capture documents 
that will be added to the digital library, including gray scale images 
compressed using JPEG (the Joint Photographic Experts Group) or 
other compression schemes, color images produced through the 
Kodak Photo CD Technology, ASCII text produced by OCR from 
scanned images, PostScript, and SGML documents. Each format 
will require an additional software library to be added to the digital 
library server, and may require additional processes to be running 
simultaneously. 

• Database searching capabilities are being enhanced. At present, the 
Cornell University Library on-line catalog is the primary searching 
tool for all material, whether in paper or digital format. The digital 
library database is much more limited, and is intended to assist with 
locating and browsing documents that have been identified by 
searching the on-line catalog. The on-line catalog is being 
integrated with the digital library by adding Z39.50 7 database 
searching links to the digital library server and client software. 
Z39.50 is a protocol for information retrieval via networks. It 
defines the way in which a program on one computer can: 1) query 
the database of another computer, and 2) request the transfer of a 
particular record or group of records. In addition, there is a demand 
for full-text searching. The TULIP project includes searching of the 
abstracts of the articles, and some of the other planned projects will 



Z39.50 is an OSI (Open Systems Interconnection) application-layer protocol from 
the National Information Standards Organization. 
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focus on database searching techniques. To provide this capability 
will require additional storage and processing resources, particularly 
to generate the full-text indexes. 
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IV. Conclusion 



The work described in this report builds on the earlier work to study 
the feasibility of the use of digital technology to preserve deteriorating 
library material. In the course of three years Cornell has moved from 
prototyping and testing to implementing requirements for building a 
broad-based digital library accessible over world wide networks for 
printing and desktop computer use. 

Cornell University Library and Cornell Information Technologies have 
embarked on a major project to scale these efforts, both in the volume 
of material to be included and in moving from a single-institution 
project to a collaborative enterprise. The Making of America project 
aims at revolutionizing scholarly access to source material on the 
development of America's infrastructure (transportation, 
communications, and the built environment) between 1860 and 1960. 

In the course of this project to build a large digital library — numbering 
100,000 volume equivalents — in cooperation with other research 
institutions, Cornell and its partners will address key remaining issues 
that must be resolved before digital libraries become an everyday 
reality. These include — but are not confined to — defining standards for 
the quality of images captured, ensuring the long-term accessibility of 
the digital files and developing the infrastructure that will enable broad 
access to sources in network-connected digital repositories. 

In advancing the electronic preservation of and access to research 
library materials Cornell seeks to understand how on-line availability 
of thematically related materials will, over time, influence patterns of 
teaching, study, research, and scholarly publication. In the process, this 
project will help research libraries redefine their missions in an 
electronic age and come to terms with the challenges they face. These 
include the spiraling cost of books and serial subscriptions; the rapid 
deterioration of collections caused by the rise of acid paper in 
publishing since the mid-nineteenth century; escalating costs and 
resistance to additional library buildings; and rising expectations by 
students and scholars for improved access to information. Digital 
image technology holds great promise to assist research libraries, and 
other purveyors of information, in addressing these challenges. 



Appendix I — The CLASS Scanning System 



The CLASS scanning system utilizes a variety of software settings to 
optimize image capture. During the initial setup, technicians apply a 
combination of settings and preview the image on the screen. The 
following settings can affect image capture. 

1. Image Type 

This setting refers to the method that is used to capture the page, not to 
the type of original. The system can be configured to treat a page as: 

Lineart: used for text or line drawings (see examples A, B, C). 

Photo: applied to continuous tone originals or to very fine steel 
engravings that cannot be captured by lineart (see examples G & N). 

High Frequency: applied to high frequency halftones (see example H). 
Halftone screening functions similarly to threshold (see below) in 
that it converts gray video to binary. However, in the case of 
halftone screening, a small two-dimensional array of thresholds is 
used to generate dots to represent continuous tones in the original. 
This allows the image to then be rescreened with a new halftone 
while avoiding moire (an undesirable pattern often introduced by 
overlaying halftone screens). 

Low Frequency: applied to stippled engravings in which light and 
shade are represented by employing dots or flicks instead of lines 
(see example K). 

Mixed: allows for automatic segmentation of each page and is intended 
for scanning illustrated books with minimal manipulation by the 
technician. This option is still under development and the version 
available to the technicians during the Testbed Project did not prove 
satisfactory for scanning books with a variety of illustrations. 

Cami Patch: used to create a test sheet with a variety of scanner 
setting options for illustrations; it is most useful for technicians 
just learning the system (see example O). 
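
The halftone screening described under High Frequency, in which a small two-dimensional array of thresholds turns gray values into dots, can be illustrated with a short Python sketch. This is not the CLASS implementation: the 4 x 4 array below is a standard Bayer ordered-dither matrix scaled to the 0-255 range, chosen only as an example, and the sketch assumes that 0 is black and 255 is white.

```python
# Hypothetical 4x4 threshold array (a Bayer matrix scaled to 0-255).
SCREEN = [
    [8, 136, 40, 168],
    [200, 72, 232, 104],
    [56, 184, 24, 152],
    [248, 120, 216, 88],
]


def halftone_screen(gray, screen=SCREEN):
    """Convert rows of 8-bit gray values to binary dots.

    Assumes 0 = black and 255 = white. Each pixel is compared with the
    threshold at the same position in the tiled screen, so a uniform gray
    area becomes a pattern of dots whose density follows its darkness.
    Returns 1 for a black dot and 0 for white.
    """
    size = len(screen)
    return [
        [1 if value < screen[y % size][x % size] else 0
         for x, value in enumerate(row)]
        for y, row in enumerate(gray)
    ]


# A uniform mid-gray patch comes out with half of its pixels black.
mid_gray = [[128] * 4 for _ in range(4)]
dots = halftone_screen(mid_gray)
black_count = sum(sum(row) for row in dots)
```

With a single fixed threshold the same patch would come out entirely black or entirely white, which is why a screen is needed to represent continuous tones on a binary device.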

In addition to Image Type, operators also chose a variety of other 
setting options designed to enhance image capture. These include: 



2. Threshold 

The CLASS scanner starts by capturing 8 bit gray video. This means 
that each pixel can have one of 256 possible values, each value 
representing a different density level on the original. Thresholding is a 
function that allows the gray video to be converted to binary, which 
has only two values per pixel, either black or white. This bilevel image 
is often referred to as a bitmap. 

The threshold setting is primarily used to establish contrast between 
text and background. Threshold is analogous to density in 
micrographics. By varying the threshold setting between 0 and 255, 
technicians can determine how light or dark the image will appear 
after it has been scanned. The threshold level has a direct effect on the 
shape and density of individual letters, and in order to determine the 
appropriate threshold, the technicians compare a 600 dpi version on 
the screen with the original page, using a 5X loupe to view the original. 
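
Thresholding as described above can be sketched in a few lines of Python. The sketch assumes that a gray value of 0 is black and 255 is white, and that pixels darker than the threshold become black; the report does not state the polarity used by the CLASS scanner, although this convention agrees with the chart's remark that the threshold was raised for faint text and lowered for darkened paper. The function name apply_threshold is hypothetical.

```python
def apply_threshold(gray, threshold):
    """Convert rows of 8-bit gray values to a bilevel bitmap.

    Assumes 0 = black and 255 = white. Pixels darker than the threshold
    become black (1); all others become white (0).
    """
    if not 0 <= threshold <= 255:
        raise ValueError("threshold must be between 0 and 255")
    return [[1 if value < threshold else 0 for value in row] for row in gray]


# One row holding dark ink (30), faint ink (90), and paper (200).
row = [[30, 90, 200]]
low = apply_threshold(row, 60)    # faint ink is lost
high = apply_threshold(row, 110)  # faint ink is captured
```

Raising the threshold from 60 to 110 turns the faint stroke black while the paper stays white, which is the kind of adjustment the technicians judged against the original under the loupe.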



3. Filter 



The digital filter is a powerful image processing function that has a 
variety of purposes. In the CLASS scanner, digital filters are used for 
both enhancement and halftone screening. Digital enhancement 
allows the edges and slope of a character to be emphasized, making it 
appear crisper. Line broadening or darkening often accompanies 
enhancement, requiring the threshold to be adjusted. Digital 
descreening is used to remove high frequency halftone dots from the 
original which are too small for the scanner to reliably reproduce. 

The choice of filters can determine how accurately the scanned image 
reflects the original. The CLASS scanner provides a variety of filter 
choices which are set for capturing published material. The current 
system offers a choice of three settings: filter 1, filter 2, or none. For 
each filter there are 9 different settings that can be applied. 



4. Tonal Reproduction Curve (TRC) Maps 

Every scanner has an inherent tone reproduction capability which 
governs the relationship between the density (darkness) of the original 
and the resulting gray level as perceived by the scanner. TRC mapping 
is a function which allows the original gray levels from the scanner to 
be remapped (via a look-up table) to some new set of gray levels. TRC 
mapping is used only in conjunction with capturing photographs and 
halftones, and is most commonly used to adjust the brightness and 
contrast of pictures when halftone screening is used. A "neutral" TRC 
is used to designate the unchanged tone reproduction; a "darker" TRC 
is used to increase the darkness of the original. 
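
The look-up-table remapping described in this section can be sketched in Python as follows. The shape of the curve is an assumption: the report does not say how the CLASS TRC maps are defined, so a simple power (gamma) curve stands in for them here, with 0 taken as black and 255 as white. The names build_trc and apply_trc are hypothetical.

```python
def build_trc(gamma):
    """Build a 256-entry look-up table from a power curve.

    Assumes 0 = black and 255 = white. A gamma of 1.0 gives a neutral
    table (every level maps to itself); a gamma above 1.0 lowers the
    mid-tones, giving a darker rendering.
    """
    return [round(255 * (level / 255) ** gamma) for level in range(256)]


def apply_trc(gray, trc):
    """Remap every gray value in the image through the look-up table."""
    return [[trc[value] for value in row] for row in gray]


neutral = build_trc(1.0)
darker = build_trc(2.0)
remapped = apply_trc([[0, 128, 255]], darker)
```

Because the remapping is a table look-up, it costs the same for any curve, and the endpoints (pure black and pure white) stay fixed while the mid-tones shift.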



5. Editing Software 

The CLASS software includes a bitmap editing program produced by 
Wang Laboratories, called Image Wizard. This software was used by the 
technicians to remove stains and foxing, to fill in incomplete 
characters, to reverse polarity, and to label scanned images. The 
program enabled the technicians to edit images at the "pixel" level, 
which proved very valuable in reconstructing damaged type. 
However, bitmap editing is time-consuming and the program was 
used only sparingly (compare examples C & D). 



6. Moire Away 

This image processing function masks the undesirable patterns 
introduced when moire occurs. Moire is an independent, usually 
shimmering pattern seen when two geometrically regular patterns 
(such as two sets of parallel lines or two halftone screens) are 
superimposed, especially at an acute angle. Although designed 
primarily for use with capturing halftones, the technicians found 
moire away useful when using the low frequency mode to capture 
stippled line engravings. The CLASS includes five different settings 
for moire away. 



7. Scan Mode 

There are four scan mode options available with the CLASS System: 
Quality Control Mode, Production Mode, Window Mode, and Cami 
Scan Mode. 

Quality Control Mode is used when setting up a book and allows the 
technicians to display the 600 dpi image on the screen in a view 
window so as to make adjustments to threshold and filter settings 
and to establish page trim. The technicians will scan a number of 
pages from a book in quality control mode to ensure that the 
settings they have chosen will capture most of the text in an 
acceptable manner. In this mode, the technicians may scan and 
view; rescan and view; save the image; and calibrate the scanner to 
determine whether the mechanical parts of the system are operating 
within tolerance levels. If the scanner is not calibrated, the image 
quality will deteriorate with subsequent scans. 

Production Mode is used for scanning when the same settings can be 
maintained for multiple pages. In this mode, technicians can scan 
and save (but not view) an entire document. Because each image is 
not displayed on screen, scanning time is significantly reduced. 
Production mode works well for capturing text and line art where 
there is very little variation in printing quality throughout the 
document. 

Window Scan Mode is used mainly for capturing illustrated text. This 
mode allows the operator to treat three different areas of a page in 
an independent manner. The image is displayed in a view window 
and the technician can overlay rectangular windows over one or 
two portions of the page and may choose separate options for 
capturing the material contained in those windows. The material 
may be captured as high frequency, low frequency, lineart, photo, or 
as a cami patch. The two windows may also overlap, enabling the 
technicians to capture text within a halftone or photo. The 
limitations of the window scan mode are that only two thresholds, 
two filters, and one tonal reproduction curve (TRC) may be selected 
at a given time. This can pose a problem when a page contains two 
halftones, one of which is high contrast and the other is low 
contrast. Because only one TRC map can be chosen, it may not be 
possible to create the best reproduction of both halftones. As noted 
earlier, Xerox is developing a system of auto-segmentation that may 
ultimately obviate the need for using window scan mode. 
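
The idea behind window scan mode, treating rectangular regions of a page with their own settings, can be sketched in Python. The sketch covers thresholds only and leaves out image types, filters, and the TRC map. It enforces the two limits stated above (at most two windows, at most two distinct thresholds) and assumes 0 is black and 255 is white. The names Window and scan_with_windows are hypothetical, as is the rule that a later window wins where two windows overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    top: int        # first row inside the window
    left: int       # first column inside the window
    bottom: int     # first row below the window (exclusive)
    right: int      # first column right of the window (exclusive)
    threshold: int  # threshold applied inside the window


def scan_with_windows(gray, page_threshold, windows=()):
    """Threshold a page, using a window's own threshold inside each window.

    Assumes 0 = black and 255 = white; returns 1 for black, 0 for white.
    """
    if len(windows) > 2:
        raise ValueError("at most two windows may be placed on a page")
    if len({page_threshold, *(w.threshold for w in windows)}) > 2:
        raise ValueError("only two thresholds may be selected at a time")
    result = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            threshold = page_threshold
            for w in windows:  # assumed rule: a later window wins on overlap
                if w.top <= y < w.bottom and w.left <= x < w.right:
                    threshold = w.threshold
            out_row.append(1 if value < threshold else 0)
        result.append(out_row)
    return result


page = [[100] * 4 for _ in range(2)]
illustration = Window(top=0, left=2, bottom=2, right=4, threshold=120)
bitmap = scan_with_windows(page, 80, [illustration])
```

The same gray value of 100 comes out white in the text area and black inside the window, which is how one page can carry text and an illustration captured at different settings.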

Cami Scan Mode is used to create a test sheet that provides 19 different 
versions of scanner settings for an image. The test sheet can then be 
used to determine scanner settings for capturing the illustration. 
During each of 19 scans, the system applies a different combination 
of scanner settings to the selected portion of the image (see example 
O). The cami scan mode is of most use to technicians learning the 
system and to experienced technicians when they encounter a new 
type of illustration. 



Appendix II — Document Architecture Description 



A digital library consists of an Image Delivery Server, networked storage, 
and a referencing database. A single digital library will contain one or 
more collections. Each collection will contain one or more documents. 

Conceptually, a single instance of a Digital Library is all the material 
and databases directly accessed by a single Image Delivery Server. If 
searches and retrieval of material must be mediated by another server 
(in order to verify authorization, for example), that material is 
considered to be in a separate Digital Library. Collections exist within a 
Digital Library for several reasons, including the traditional library 
reasons of grouping thematically-related material (such as a Witchcraft 
Collection or an Icelandic Collection) and information technology- 
related reasons (such as keeping licensed material with special 
authorization requirements grouped together). 

The referencing database allows searching for documents by author, 
title, and document ID. Searching is qualified by collection. Hence a 
database search could mean a search of the entire database (all 
collections within the Digital Library instance) or a single collection. 

In the current implementation, the referencing database is a relational 
SQL database, and each collection is represented by a table in the 
database. It is planned to migrate to Z39.50 database searching as the 
preferred method, as this protocol has been established as the standard 
for library applications. 
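
The arrangement described above, one relational table per collection with searches qualified by collection, can be sketched with Python's built-in sqlite3 module. The schema is an assumption: the report names the searchable fields (author, title, document ID) but gives no table or column definitions, so document_id, author, and title are hypothetical column names, and the sample rows are invented placeholders.

```python
import sqlite3

SEARCH_FIELDS = ("author", "title", "document_id")


def _checked(collection):
    # Table names cannot be bound as SQL parameters, so validate them
    # before placing them in the statement text.
    if not collection.isidentifier():
        raise ValueError(f"unsafe collection name: {collection!r}")
    return collection


def add_collection(conn, collection):
    """Create the table that represents one collection."""
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{_checked(collection)}" '
        "(document_id TEXT PRIMARY KEY, author TEXT, title TEXT)"
    )


def add_document(conn, collection, document_id, author, title):
    conn.execute(
        f'INSERT INTO "{_checked(collection)}" VALUES (?, ?, ?)',
        (document_id, author, title),
    )


def search(conn, collections, field, term):
    """Search the named collections; pass every collection to search all."""
    if field not in SEARCH_FIELDS:
        raise ValueError(f"cannot search by {field!r}")
    hits = []
    for collection in collections:
        rows = conn.execute(
            f'SELECT document_id, author, title FROM "{_checked(collection)}" '
            f"WHERE {field} LIKE ?",
            (f"%{term}%",),
        )
        hits.extend((collection, *row) for row in rows)
    return hits


conn = sqlite3.connect(":memory:")
for name in ("witchcraft", "icelandic"):
    add_collection(conn, name)
add_document(conn, "witchcraft", "doc0001", "Example Author", "A Sample Treatise")
add_document(conn, "icelandic", "doc0002", "Another Author", "A Sample Saga")
one_collection = search(conn, ["witchcraft"], "title", "sample")
whole_library = search(conn, ["witchcraft", "icelandic"], "title", "sample")
```

A search of the entire Digital Library instance is simply the same query run over every collection table, which is one reason the processing load grows as collections are added.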

Authorization will be primarily collection-based, although the design 
will permit authorization checking at any level down to the individual 
file. It is intended that when a patron selects a library or a collection for 
searching, he will immediately be informed if he is not authorized to 
access documents within that library or collection. A patron might not 
be authorized to access a particular document or component of a 
document, but in that case the notification would come only when the 
patron attempted to open the document or access the particular 
component. 

Each document consists of three components: the logical structure; the 
physical references; and the data files. 

The logical structure is a logical description of the document. 
Conceptually, a document is a tree, with the leaves being the data files 
(pages). At a minimum, all documents have a logical structure which 
lists the pages in the document and the order in which they appear. 
Usually, documents will have a more elaborate structure. The logical 
structure relates the logical components of a document to the physical 
references which make up the document. 

The physical references map the lowest levels of the document's 
logical structure (the leaves of the tree) to the files that contain the data. 
Where there are multiple representations of a page, such as images at 
various resolutions, these are linked together in the physical references 
file. 

The data files contain the data making up a document. Any format can 
be accommodated: image files, ASCII text, PostScript, etc. However, 
one-to-one correspondence between data files for a given physical 
reference is assumed. That is, if there are multiple file types for a single 
page, these files should represent exactly the same information. 



Physical References file 

The Physical References file is the component of the document which 
relates logical structures (logical components of documents) to physical 
files. Document references, by which a document can be composed of 
all or part of other documents possibly residing on different servers, are 
handled in the Physical References file. 

A document may contain multiple document objects, each of which 
contains one or more data objects. When a document contains actual 
physical data (for example, it is created by scanning or importing 
images), a Master Document Object is created. When a document 
incorporates components of other documents, a Reference Document 
Object is created for each of the other documents. The Document 
Objects are numbered with internal reference numbers, which are 
included in the corresponding Data Object lines. 

Data Object lines include the Document Object number, the file 
reference number, and the file type. The Document Object number 
refers to a Document Object line, from which the library name, 
collection name, and document ID can be retrieved. The tuple 

<libraryID>+<collectionID>+<documentID>+<filetype>+<file reference> 

is guaranteed to locate a file. Each Data Object line refers to a single file; 
where multiple file types of a single document page exist, there will be 
multiple Data Object lines for that page. 

In the file, all Document Object lines will precede all Data Object lines 
for a given document. Document Object lines may be either grouped 
together at the beginning of the file, or may immediately precede the 
first Data Object line for the Document Object. Document Object lines 
will appear in order by Document Object number. Data Object lines 
will appear in order by sequence number, NOT by Document Object 
number. 

The fields in the Physical References file are delimited by vertical bars. 



Document Object lines 

Field  Description              Comments 

1      Document Object number   0 = Master Document Object 
                                Other = Reference Document Object 
2      Library name             Server name 
3      Collection name 
4      Document ID              8-digit number 
5      Author name 
6      Volume 
7      Title 
8      Edition 



Data Object lines 

Field  Description                 Comments 

1      Document Object number      Corresponds to above 
2      Sequence number 
3      File reference              Reference number used to 
                                   locate file in filing system 
4      Physical reference number   Corresponds to Logical 
                                   Structure file 
5      File type                   0 = Structure file 
                                   1 = TIFF 600dpi 
                                   2 = TIFF thumbnail 
                                   3 = ASCII version of page 
                                       (i.e., OCR output) 
                                   4 = ASCII notes 
                                   5 = Other 

Physical References file example 

0|CORNELL|OLINLIB|00000001|Boole, Mary Everest||Philosophy & Fun Of Algebra|| 

0|0|00000001|0|0||   (File ref. #1 = Physical ref. #1 = Logical Structure file) 
0|1|00000002|5|1||   (File ref. #2 = Phys. ref. #5 = 600dpi TIFF image) 
0|2|00000003|5|2||   (File ref. #3 = Phys. ref. #5 = 100dpi TIFF image) 
0|3|00000004|6|1||   (File ref. #4 = Phys. ref. #6 = 600dpi TIFF image) 
0|4|00000005|6|2||   (File ref. #5 = Phys. ref. #6 = 100dpi TIFF image) 



Note that in the above, it is guaranteed that file references 2 and 3 are 
two different versions of the same page, as are file references 4 and 5. 
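
To illustrate how the vertical-bar layout could be read, here is a minimal Python sketch that parses a Physical References file and groups the Data Object lines by physical reference number, so that the alternative versions of one page end up together. The names `DataObject`, `parse_physref`, and `group_by_page` are hypothetical. The report does not say how a reader tells the two line types apart, so the sketch assumes a line is a Data Object line when its second field is numeric (a sequence number) and a Document Object line otherwise (a library name).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DataObject:
    doc_object: int     # field 1: Document Object number
    sequence: int       # field 2: sequence number
    file_ref: str       # field 3: file reference (kept as text, zero-padded)
    physical_ref: int   # field 4: physical reference number
    file_type: int      # field 5: file type code (0 = structure file, 1 = TIFF 600dpi, ...)


def parse_physref(text: str) -> Tuple[Dict[int, List[str]], List[DataObject]]:
    """Split a Physical References file into Document and Data Object lines."""
    doc_objects: Dict[int, List[str]] = {}
    data_objects: List[DataObject] = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip blank or malformed lines
        if fields[1].isdigit():
            # Assumption: a numeric second field marks a Data Object line.
            data_objects.append(DataObject(int(fields[0]), int(fields[1]),
                                           fields[2], int(fields[3]),
                                           int(fields[4])))
        else:
            # Document Object line: keep fields 2..8 (library name onward).
            doc_objects[int(fields[0])] = fields[1:8]
    return doc_objects, data_objects


def group_by_page(data_objects: List[DataObject]) -> Dict[int, List[DataObject]]:
    """Collect all file versions that share a physical reference number."""
    pages: Dict[int, List[DataObject]] = defaultdict(list)
    for obj in data_objects:
        pages[obj.physical_ref].append(obj)
    return dict(pages)
```

Grouping on the physical reference number is what makes the guarantee above usable in practice: a client can pick the thumbnail or the 600 dpi image of the same page from one group.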



Logical Structure file 

The Logical Structure file is the component of the document structure 
which offers "views" of a document and links images together logically 
to define documents. The file is actually an unloaded tree; when a 
document is "opened", the file is read and the tree reconstructed. By 
convention, all Logical Structure files contain one logical structure 
"PAGES" which defines the document by listing the pages in the order 
in which they appeared in the original document. 



Document Structure lines 

Field  Description               Comments 

1      Parent structure number   Structure is a child of... 
2      Sequence number 
3      Logical Structure name    Label for this structure 
4      Structure number          Corresponds to Physical Reference 
                                 file 
5      Logical Children          # of logical children of this 
                                 structure 
6      Physical Children         # of physical children of this 
                                 structure 
7      References                # of references to this structure 
                                 within this document 
                                 (for how many structures is this 
                                 a substructure) 

Logical Structure file example 

0|0|ROOT|0|4|0|0|                  Structure 0, ROOT, has 4 logical children 
0|1|PAGES|1|100|0|1|               Str. 1, PAGES, has 100 logical children 
0|2|CONTENTS|2|22|0|1|             Str. 2, CONTENTS, has 22 logical children 
                                   ...has no physical children 
1|1|Production note|5|0|2|2|       Str. 5 is child of structure 1 
                                   ...has a label "Production note" 
                                   ...has no logical children 
                                   ...has 2 physical references 
                                   ...is referenced twice in this document 
1|2||6|0|2|1|                      Str. 6 has no label 
1|3||7|0|2|1|                      Str. 7 has 2 physical references 
1|4||8|0|2|1|                      Str. 8 is referenced only here 
1|5||9|0|2|1|                      Str. 9 is the 5th sequential child of PAGES 
. . . 
1|99||103|0|2|2| 
1|100||104|0|2|2| 
2|1|Production note|105|1|0|1|     Str. 105 is a child of str. 2 
2|2|Title page|106|1|0|1|          Str. 106 has 1 logical child 
2|3|Table of contents|107|2|0|1| 
2|4|Chapter 1. From Arithmetic to Algebra|108|6|0|1| 
2|5|Chapter 2. The Making of Algebras|109|4|0|1| 
2|6|Chapter 3. Simultaneous Problems|110|4|0|1| 
2|7|Chapter 4. Partial Solutions ...|111|3|0|1| 
2|8|Chapter 5. Mathematical Certainty ...|112|3|0|1| 
2|9|Chapter 6. The First Hebrew Algebra|113|8|0|1| 
2|10|Chapter 7. How to Choose our Hypotheses|114|9|0|1| 
2|11|Chapter 8. The Limits of the Teachers Function|115|5|0|1| 
2|12|Chapter 9. The Use of Sewing Cards|116|4|0|1| 
2|13|Chapter 10. The Story of a Working Hypothesis|117|6|0|1| 
2|14|Chapter 11. Macbeths Mistake|118|6|0|1| 
2|15|Chapter 12. Jacobs Ladder|119|2|0|1| 
2|16|Chapter 13. The Great X of the World|120|4|0|1| 
2|17|Chapter 14. Go Out of My Class-room|121|4|0|1| 
2|18|Chapter 15. ...|122|3|0|1| 
2|19|Chapter 16. Infinity|123|6|0|1| 
2|20|Chapter 17. From Bondage to Freedom|124|5|0|1| 
2|21|Appendix|125|2|1|1| 
2|22|advertisements|126|4|1|2| 
105|1|Production note|5|0|2|2|     Str. 5 is a child of str. 105 
106|1|Title page|11|0|2|2|         2nd reference to str. 11 
107|1|7|15|0|2|2| 
107|2|8|16|0|2|2| 
. . . 
126|4||104|0|2|2| 
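
The description of the Logical Structure file as an "unloaded tree" that is rebuilt when a document is opened can be sketched in Python as below. This is a rough illustration and not the project's code: `Structure`, `load_tree`, and `pages_under` are hypothetical names, the seven-field line layout is taken from the Document Structure lines table, and the sketch assumes that ROOT is the one line whose parent number equals its own structure number. Because one structure may be listed under several parents, the result is strictly a graph with shared leaves.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Structure:
    number: int                 # field 4: structure number
    name: str                   # field 3: label (may be empty)
    physical_children: int      # field 6
    children: List[int] = field(default_factory=list)  # child structure numbers, in sequence order


def load_tree(text: str) -> Dict[int, Structure]:
    """Rebuild the parent/child links from Document Structure lines."""
    nodes: Dict[int, Structure] = {}
    links = defaultdict(list)   # parent number -> [(sequence, child number)]
    for line in text.splitlines():
        f = [p.strip() for p in line.split("|")]
        if len(f) < 7 or not f[0].isdigit():
            continue  # skip blank or annotation-only lines
        parent, seq, name, number = int(f[0]), int(f[1]), f[2], int(f[3])
        # A structure referenced more than once is stored only once.
        nodes.setdefault(number, Structure(number, name, int(f[5])))
        if number != parent:    # assumption: ROOT is listed as its own parent
            links[parent].append((seq, number))
    for parent, kids in links.items():
        if parent in nodes:
            nodes[parent].children = [n for _, n in sorted(kids)]
    return nodes


def pages_under(nodes: Dict[int, Structure], number: int) -> List[int]:
    """Return the leaf structures (pages) below a structure, in reading order."""
    node = nodes[number]
    if not node.children:
        return [number]
    out: List[int] = []
    for child in node.children:
        out.extend(pages_under(nodes, child))
    return out
```

Calling `pages_under` on the PAGES structure would list every page in original order, while calling it on a CONTENTS entry would list only the pages of that chapter.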

IMPLEMENTATION DETAILS 

The tuple: 

<library ID>+<collection ID>+<document ID>+<filetype>+<file reference> 

is guaranteed to locate a file. A file locator program will translate 
between this tuple and the fully-qualified path and file name in the 
underlying file system. 

While a library will always have a hierarchical nature corresponding to 
UNIX file systems, the order of the hierarchy will be flexible to 
accommodate optimization efforts. Each level of the hierarchy will 
have an INFO file that describes the order of the lower levels of the 
hierarchy. The file locator program will read these files as it navigates 
the directory structure of the file system when a library, collection, or 
document is opened. Two examples follow: 



Example 1. 



Hierarchy is LIBRARY, COLLECTION, DOCUMENT, FILETYPE. 



/<library name> 
    LIBINFO.TXT                Description of library 
    /<collection name> 
        COLINFO.TXT            Description of collection 
        /<document ID> 
            DOCINFO.TXT        Description of document 
            LOGSTR.000         Logical structure file 
            PHYSREF.000        Physical reference file 
            /<filetype1> 
                00001.TIF 
                00002.TIF 
            /<filetype2> 
                00001.TIF 
                00002.TIF 
Example 2. 

Hierarchy is LIBRARY, FILETYPE, COLLECTION, DOCUMENT. 

/<library name> 
    LIBINFO.TXT                Description of library 
    /<filetype1> 
        /<collection name> 
            COLINFO.TXT        Description of collection 
            /<document ID> 
                DOCINFO.TXT    Description of document 
                LOGSTR.000     Logical structure file 
                PHYSREF.000    Physical reference file 
                00001.TIF 
                00002.TIF 
    /<filetype2> 
        /<collection name> 
            COLINFO.TXT        Description of collection 
            /<document ID> 
                DOCINFO.TXT    Description of document 
                LOGSTR.000     Logical structure file 
                PHYSREF.000    Physical reference file 
                00001.TIF 
                00002.TIF 



This implementation involves some redundancy, but it permits 
complete copies of a collection to be mounted on different file systems 
for performance considerations. In particular, the second scheme 
would facilitate storing all low-resolution images on high-speed 
magnetic disk for fast access, and all high-resolution images on slower, 
less expensive storage. This will also facilitate authorizing access to 
low-resolution images by other software systems (FTP, Gopher) while 
restricting access to high-resolution images. 
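
A file locator of the kind described could look roughly like the Python sketch below. It is only a sketch under assumptions: the report does not give the format of the INFO files, so the hierarchy order is passed in as a plain list, and `locate_file`, the level names used as dictionary keys, the directory name `TIFF600`, and the rule that the file name is the file reference zero-padded to five digits are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Sequence


def locate_file(root: str, order: Sequence[str], parts: Dict[str, str],
                file_ref: int, extension: str = "TIF") -> PurePosixPath:
    """Translate a locator tuple into a path for a given hierarchy order.

    order -- hierarchy levels as they would be read from the INFO files,
             e.g. ["LIBRARY", "COLLECTION", "DOCUMENT", "FILETYPE"]
    parts -- directory name for each level (keys are assumed level names)
    """
    path = PurePosixPath(root)
    for level in order:
        path /= parts[level]          # descend one level of the hierarchy
    # Assumption: file names are the file reference padded to five digits.
    return path / f"{file_ref:05d}.{extension}"
```

The same tuple resolves to different paths under the two example hierarchies, which is what lets a collection be reorganized for performance without changing any Physical References file.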



Creation of the Testbed involved assembling the core support for test 
activities. Core support requirements included staff, equipment, 
facilities, and the implementation of a methodology for testing and 
evaluating technologies. 



1. Staff 

Cornell recruited the necessary staff to begin work on the Testbed in 
September 1991. The staff to support the testbed includes scanning 
technicians and software programmers. Fortunately, scanning 
technicians who had been with Phase I continued to work on the 
Testbed project. In addition to their experience with preservation 
scanning acquired as a result of their work on this project, each of them 
brought relevant experience in printing and photography that has 
proven useful for digital imaging. Technical staff had to be found with 
the knowledge and experience to develop an image delivery server and 
client software for three computer platforms. Again Cornell was 
fortunate that the technical leader for Phase I of the Joint Study 
continued in that role for this phase. A search for qualified individuals 
to develop software for UNIX and Apple Macintosh environments 
resulted in two excellent candidates joining the project team. 



2. Equipment 

Cornell brought all of the equipment from the Phase I project, 
including Xerox scanning workstations and the DocuTech printer, to 
the Testbed. In addition, equipment to support the software developers 
and provide a production level of service to library patrons was 
secured. New hardware was provided by Sun Microsystems, Inc. and 
by Cornell Information Technologies. Sun Microsystems provided the 
image delivery server, consisting of a Sun SPARCstation and 
peripheral equipment, to enable researchers at other institutions to 
access digital files stored at Cornell; two publicly located workstations to 
offer library patrons the opportunity to browse and select books from 
the digital library within library buildings; and one workstation for 
software development. Cornell Information Technologies provided an 
Apple Macintosh IIci computer and peripherals for use in the 
development of client software, an IBM PS/2 Model 95 XP for client 
development, and a Sun SPARCstation 1+. All hardware and 
associated software were in place by March 1992. 

3. Facilities 

Cornell University Library provided dedicated work space for the 
scanning technicians and for the software development team. Room 
701 Olin Library was established as the scanning area for the Testbed. It 
was equipped with 2 telephone lines and enough telecommunications 
capability to support 3 Ethernet-connected workstations and one 
AppleTalk workstation. The software development team was located in 
Room 503 Olin Library. This room was equipped with three telephones 
and extensive telecommunications capability. Eight Ethernet lines and 
one AppleTalk line provide the connectivity for the workstations and 
development servers being used by this group. Sufficient 
communications are included to support some level of 
experimentation with new equipment as deemed necessary as part of 
Testbed activities. 



4. Set of Test Source Materials 

In order to meet the objectives of the Cornell Testbed, it was necessary 
to identify and prepare a consistent set of test source materials that 
could be used repeatedly to compare different scanning technologies. 
This group of materials, along with a set of procedures and protocols, 
was used to evaluate technologies in a consistent manner. 

A set of test documents has been chosen across the range of materials 
typically found in modern research libraries, including: 

a. Books, representing a variety of printing processes and languages, as 
well as those containing illustrations (photographs, halftones, 
photogravures, woodcuts, line art, original etchings, engravings, 
and color plates) that were selected from the first 1,000 volumes 
scanned during Phase I. These books are used to test and retest a 
variety of image capture processes. For example, among the books 
included is The Steam Turbine: The Rede Lecture 1911, by Sir 
Charles A. Parsons, which contains halftones and line drawings 
embedded in text. This volume was scanned in September 1991 on 
the P2.8 scanning system using a mixed scanning mode; rescanned 
in March 1992 with software upgrades that resulted in improved 
image capture; and scanned a third time in April 1993 on the final 
prototype version of the CLASS scanning system. A digital 
computer output microfilm copy was created from the earlier 
scanned version in late 1991, and a conventional film copy was 
created from the original volume at the same time. 

b. Serials, chosen from the fields of materials science and engineering 
that are being used in TULIP, The University Licensing Program, a 
cooperative research project testing systems for networked delivery 
and use of journals, involving Elsevier Scientific Publishers and ten 
U.S. universities, including Cornell. 

c. Dissertations, from the School of Engineering that are being used in 
the Cornell/Michigan/Penn State Dissertations Online Project. The 
Cornell portion of this project is referred to as DAISY, Dissertation 
Access over Internet SYstems. During the first year of DAISY (1993), 
the project will be a modest one, involving only a subset of 
dissertations, produced in paper format. Ultimately, the project 
could be expanded to cover all dissertations including those 
produced in multi-media formats. 

d. Archival and Manuscript Material, including holographs, 
architectural drawings, photographs, typescript and machine- 
printed documents, and maps. Items have been drawn from the 
holdings of Cornell's Division of Rare and Manuscript Collections 
and those kindly donated by the University of Rochester that were 
scanned during the Preserving Archival Material through Digital 
Technology Cooperative Project (1992/93), sponsored by Cornell on 
behalf of the Eleven Comprehensive Research Libraries in New 
York State. Some of this material has also been scanned using a 
LaCie Color Scanner to capture 8-bit gray scale at a variety of 
resolutions. 

e. Other material includes art work (water colors, pencil sketches) and 
field books from the Louis Agassiz Fuertes Papers; photographs and 
architectural drawings from the planning collection of John Nolen; 
and photographs from the University Archives that were selected 
for use in the Kodak Library Image Consortium (KLIC) Project, 
involving Cornell, University of Southern California, Eastman 
Kodak, and the Commission on Preservation and Access. KLIC is a 
cooperative investigation into the use of the Kodak Photo CD 
technology to preserve and make available on-screen historical 
photographs, paintings, and other images. A number of items 
scanned using the Xerox CLASS binary scanner have also been 
converted via the Kodak Photo CD technology. 

f. Technical Test Targets, including the AIIM Scanner Test Charts, as 
cited in ANSI/AIIM MS44-1988, "Recommended Practice for 
Quality Control of Image Scanners," are used to test a variety of 
scanning technologies and settings. Test charts include IEEE Std 
167A-1987 Facsimile Test Chart and AIIM Scanner Test Chart #2. 
Kodak gray wedges and color charts will be used where appropriate. 

5. Criteria for Evaluation 

In addition to developing a control group of source material, a 
procedure for comparing digital scanning and storage systems was 
developed which occurs in two phases. The first phase involves 
screening to test new equipment against established benchmarks 
identified for the Xerox CLASS system. A decision is then made 
whether to proceed with a full evaluation. During this phase, a 
portion of the control set of material is processed that is most 
appropriate to the equipment under evaluation. 

The second phase of evaluation will be initiated for the most 
promising of systems and consists of a full test of the new technology. 
A large number of items will be digitized — the amount and type may 
vary — but the intent is to work with new documents in this phase and 
to test the particular capabilities of the system. Material will be 
scanned, stored, transmitted, retrieved and reformatted from the digital 
file. In this manner, the size of the digital library will continue to 
increase, more material will be preserved, and tests of new material can 
be conducted. 

Each new system will be tested and evaluated in terms of usability, 
quality, and production capabilities. In evaluating each system, the 
following will be considered: 

— Adherence to standards and open system connectivity 

— Efficiency or speed of capture 

— Costs of the process and the products 

— System usability, including user friendliness and availability 
of documentation 

— Equipment reliability and vendor support. 

Scanning technicians use standardized worksheets to record 
information on the items being scanned; time spent in set up, image 
capture, transmission, storage, and quality control; and system settings 
used. 

The digital files are reviewed for the quality of image capture, 
resolution, compression, file size, and adherence to standards and 
protocols. Where appropriate, the quality of output choices (paper, on- 
screen images, digital COM) will also be compared. 

QUALITY 

Criteria for evaluating the quality of scanned images, in particular 
printed facsimiles, were developed as part of the Testbed Project. Image 
capture is affected by a number of factors, including the scanning 
process (binary, gray scale, color), the compression method, the 
equipment (including the scanner, the view station, and the printer), 
the condition of the equipment, the type of original (including media 
and support), and the physical condition or quality of the original. 8 

Material scanned to date in the Testbed falls into four main categories: 

a. line art (printed matter, holographs, typewritten material, 
blueprints, maps, etchings, various copying processes, 
including letterpress, thermofax, carbons, and photocopies). 

b. continuous tone (photographs, crayon and chalk drawings, 
acrylic, watercolor, and photographically reproduced 
facsimiles of artwork). 

c. halftones (reproductions, usually created from photographs, in 
which dots are used to represent continuous tones.) Color 
halftones use varying hues and combinations of the 
subtractive, or "process" colors to represent full continuous 
tone images. 

d. mixed (containing both text and continuous tone or halftone 
images). 

The following factors are considered in determining overall quality. 
These are divided into those affecting the quality of line art and text 
and those to consider in evaluating continuous tone and halftone 
images. If the item falls into the category of "mixed," all of the factors 
listed below should be considered. 




8 For a description of their effects on scanning, see Kenney, A. with M. Friedman 
and S. Poucher, Preserving archival material through digital technology. Final 
report. 1993. Available for $10 from Cornell University Library, Department of 
Preservation, 215 Olin Library, Ithaca, NY 14853. 

Text/Line Art Characteristics for Evaluation 

1. Legibility and completeness. Is the text readable? Is it completely 
rendered? 

2. Darkness. How dark do the characters appear? Are the characters 
consistently dark across the page (bearing in mind the original)? In 
general, the darker the better. 

3. Contrast. Is there good contrast or differentiation between the text 
and the background? Is there even illumination across the image 
(again bearing in mind the original)? Is there a gray cast or streaking in 
the background? Is the image washed out? too dark? 

4. Edge Raggedness. This relates to the "smoothness" or "straightness" 
of edges along lines at very close inspection. Special attention should 
be paid to curved and diagonal lines on characters and line graphics, as 
compared to the original. Review should be made with a magnifying 
lens or eye loupe (5 to 10X magnification will suffice) and also by 
unaided visual inspection. The human eye is often forgiving of minor 
imperfections and will fill in a character to make it complete. 

5. Sharpness. This is a measure of the quality of the transition from 
black to white across an edge. A perfect line is black on one side and 
white on the other. A true straight line is difficult to duplicate, and the 
perceived quality of line reproductions can be affected. Using a 
magnifying glass, follow a line across the page and repeat this with the 
unaided eye. Does the line appear sharp or is it jagged in places? 

6. Line width fidelity. This relates to the system's ability to reproduce 
reliably the width of lines as defined by the original. Line widths for 
the original and the scanned copies should be compared, including 
samples ranging from very thin to very thick lines. Check to see if the 
smallest lower case "e" or "a" is closed in. Are serifs and fine detail 
fully rendered? An eye loupe with millimeter line markings will be 
useful in this evaluation. 

7. Character size fidelity. This relates to the system's ability to reproduce 
reliably the height and width of individual characters as defined by the 
original. This can best be determined by measuring the height of the 
lower case "x" or "e." Again, an eye loupe with millimeter line 
markings would be useful in this evaluation. 
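
As a supplement to the loupe measurements described above, the arithmetic for comparing a measured original with its scanned image can be sketched as follows. The function names are hypothetical, and the sketch assumes only that the scan resolution is known (600 dpi is used as the default because that is the binary scanning resolution discussed in this report); it does not replace visual inspection.

```python
MM_PER_INCH = 25.4


def pixels_to_mm(pixels: int, dpi: int = 600) -> float:
    """Convert a run of pixels in the scanned image to millimetres."""
    return pixels * MM_PER_INCH / dpi


def width_error_percent(original_mm: float, scanned_pixels: int,
                        dpi: int = 600) -> float:
    """Signed deviation of the scanned width from the original, in percent.

    A positive value means the scanned line or character is wider (or
    taller) than the original; a negative value means it is thinner.
    """
    scanned_mm = pixels_to_mm(scanned_pixels, dpi)
    return 100.0 * (scanned_mm - original_mm) / original_mm
```

At 600 dpi one pixel corresponds to roughly 0.042 mm, so a fine line in the original may be only a few pixels wide and a one-pixel change is already a large relative error.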
Pictorial/Graphic Characteristics for Evaluation of Continuous Tone and 
Halftone Images: 

1. Tone reproduction. This relates to the ability of a system to 
reproduce the range of tones in the original. The quality of the 
highlight and shadow regions often suffers from a reduced dynamic 
range in reproductions. This can be evident when detail present in the 
original is lost in the dark or light portions of the copy. It can also be 
seen in the rendering of distinctions presented by colored items. 

2. Uniformity. The system should produce uniformity of grays in the 
reproduction. Look for even gradations and the full medium values of 
halftones. Banding, streaking, and graininess are typical problems 
encountered in reproducing graphic materials. The digital versions of 
halftones in particular can exhibit a moire effect which will appear as a 
watered or wavy pattern. 

3. Detail. This relates to the system's capability to preserve any fine 
detail in an original. System evaluation will require a careful 
comparison of the copies to the original, and a magnifying glass will 
come in handy. 



Color Items 

In addition to the above categories, consideration must be given to 
whether the item is monochrome or color. For those systems that only 
reproduce black and white, color information is conveyed by shades of 
gray. The question then becomes how well the distinctions in shading 
represented by the color have been rendered. 

For color scanning, the evaluation should also take into consideration 
the fidelity of the hue, value, and intensity of different colors 
represented in the original. Fuller criteria for evaluating color 
scanning will be developed as the Kodak Photo CD project progresses. 
Modification of this evaluation process will result from practical 
experience and technological developments. 

CHAPTER VII 



THE USES OF THE STARS 

Some persons habitually regard everything and everybody 
by the use that may be made of it or them, their idea of 
the word being very closely related to hard cash. If we were 
all to indulge in this eminently practical view, many of the 
things proudly considered by civilised humanity as among 
its highest and most lasting possessions would indeed be of 
little “use” to us, as their value cannot be appraised by this 
rule of the market-place. Milton would rank far behind a 
smart man of business, if the activities of the two men were 
measured by the prosaic everyday scale some folk are so 
ready to apply. It is a difficult task to estimate brain-work 
according to its material value. Schiller received the absurd 
sum of £15 for his “William Tell,” but he might just as 
well have been presented with a castle on the Rhine, for it 
is enthusiasm and not cold calculation that plays the part 
of valuer in these cases. Nor must we forget that things of 
but little practical value may possess a very high ideal one. 
The delight in all that is lofty and sublime will act as a 
perfect tonic and recreation on the mind of man weary with 
his daily round of toil; it will strengthen him to battle with 
the strain of this life, and assist him to emerge successfully 
from the struggle. 

The Lesson of the Stars. — The majority of those to 
whom I have shown the wonders of the heavens in the silent 
observatory halls have usually been impressed and subdued 
by the majesty of the universe; but, naturally, there were 
some who could not leave the world’s dross behind, and who 
were anxious to ascertain the actual use of the stars to us. 
As a matter of fact, the stars are of great use to us, or, rather, 
astronomy has a very definite value, although in my own 
humble opinion their ideal influence ranks infinitely higher. 
Man, the insignificant parasite on the grain of sand called 
earth, floating in the infinite, is by nature a searcher in the 
quest of truth and eternity; he would not be satisfied if he 
could not gain a definite idea, based on scientific research, of 
those sparkling lights overhead. And even if he were still 
more advanced in the grim realities of life, he would be 
unworthy of the civilisation he revels in were he entirely 
ignorant of the problems nightly set to him by the starry 
heavens. The knowledge that the vast army of stars 
moves according to eternal, inexorable laws, that this very 
regularity guarantees its everlastingness, unwittingly influences 
human actions and creations. The grandeur of the 
universe should be a powerful agent in eradicating that 
empty pride and class distinction from which the private and 
social contrasts that burden mankind generally arise. A 
careful study of our all-mother Nature will assist us to do 
away with self-righteousness and overbearing demeanour, 
and teach us dignity and modesty in every walk of life. 

Practical Value of Astronomy. — But it is of the usefulness 
of celestial science I wish to speak. It has been stated 
several times in this volume that astronomy originally was 
a definition of time, a calendar science. Civilisation is more 
closely allied to time determination and definition than may 
be at first believed possible. Primitive man, whose daily 
work consisted in the protection and nourishment of his 
body, and who rested in his cave at night, was content 
with the setting and the rising of the sun to mark his day. 
It is quite probable that he, too, noticed an alteration in the 
position of the sun at different seasons, the appearance of 
the constellations at various times of the year; and, guided 
by them, began the preparations necessary to guard him 
from the cold and the rains. The more the brain of man 
developed, the more complicated his needs became; the 
higher civilisation rose the more his conception of time 
increased, and the greater his interest in the course of days 
and years grew. Stonehenge, near Salisbury, that ancient 
circle of gigantic hewn boulders erected about three thousand 
years ago, is naught but a time and calendar definition of 
our forefathers. The huge stone blocks, 150 in all, are set 
up in two circles; an altar stands in the centre, on which 
animals were probably sacrificed on certain days. On looking 
straight over the altar, the eye meets a stone column in 
the distance (Fig. 82), and on June 21 (summer’s beginning) 
the sun will be seen to rise exactly over its top. Other 
boulders very likely served a similar purpose and were used 
as sun-dials. 

Fig. 83. — Monument at Nuremberg to Peter Henlein (Hele), erected by the German Clockmakers’ Union. 

Fig. 84. — Drum-shaped Watch from the Marfels Collection. 

Fig. 85. — Old Household Clock. 

Fig. 86. — Sun Clock on Chartres Cathedral. 

Time’s Changes. — They have gone past recall, those 
good old times when the post-chaise jolted one’s bones over 
the country lanes and the postilion blew merry blasts on 
his horn. Past are the days when it took five minutes or so 
to strike a light or set the lamp burning, when telephone, 
telegraph, railway, motor-cars and all else that counts time 
by seconds were yet unknown. “Once upon a time!” We 
of the twentieth century, who grumble even at modern 
locomotion, pursue the very seconds. All this counting of 
minutes and fractions of minutes, rendered necessary by 
train and tram services, telephone and telegraph, which 
forces a more rapid mode of existence on us and permits 
us to accomplish more in an hour than was formerly 
possible in half a day, all this, I say, would be entirely 
impossible without the clock. 

The Invention of the Watch.— Since when did we carry 
this ticking register of the fleeting hours about with us? It 
was a German who presented the world with the first watch, 
an honest locksmith of Nuremberg named Peter Henlein 
(or Hele, as it is popularly abbreviated), who constructed 
the first clumsy iron pocket chronometer (Fig. 83). The 
following account of it appeared in the Cosmographia 
Pomponii Melae, published in Nuremberg in 1511: “Every 
day finer things are being invented. Peter Hele, still a 
young man, has constructed a piece of work which excites 
the admiration of the most learned mathematicians. He 
shapes many-wheeled watches out of small bits of iron, 
which run without weights for forty hours, however they 
may be carried, in pocket or chemisette.” 

One of these earliest of pocket watches is contained in 
the celebrated Marfels collection (Fig. 84); it is made completely 
of iron, and the weird works show that the watchmaker’s 
art was still in its earliest infancy. A very quaint 
feature is the pig’s bristle in the centre of the works to 
regulate the movement — replaced now by the tiny throbbing 
spiral spring. This watch only has one hand, and a small
knob is fixed above every figure on the handsome dial of
punched bronze to enable the position of the hand to be
told in the dark. The watch is not oval in shape, as is 
generally believed, but rather resembles a drum. The 
“Nuremberg egglets” are a decided improvement on these
clumsy things, which were more suited to saddlebags than 
to a waistcoat pocket. Watches in those early days were 
expensive articles, purchasable only by gentlemen of rank, 
and circumstantial details are given in old letters and 
chronicles of the purchase or presentation of such an 
“egglet.” Dandies for many years carried pretty little 
hour-glasses in their pockets. Mechanism in those days was 
hardly at its height; clocks and watches went very much 
as they pleased, and the lucky individuals of that period 
did not need to bother about fractions of a minute. Until 
the year 1700 watches only had an hour-hand, the minute 
hand was totally lacking, and it was impossible to tell 
time correctly within ten minutes or so; but in those days 
a quarter of an hour was of little importance. There was 
no boat-train to leave for Dover at 8.30 a.m., no electric car 
to be caught at a certain time, and the speed craze in all 
its shapes was yet unknown. 

Early Public Clocks.— The people who dragged these
timekeepers about in their pockets had no little weight to 
carry, and yet what a vast step in the right direction they 
marked! Until they were invented, in 1511, all smaller 
towns and villages had to depend entirely on sun-dials. 
True, in 996 the French priest Gerbert, who later reigned 
as Pope Sylvester II., constructed the first clock with weights 
and wheels, and some very rich communities had one such 
erected on the church tower or the town hall ; but that 
was an extravagance only a few of the very largest towns 
could afford.* The oldest public clocks were those set up 
in 1314 on Caen Bridge, in 1340 at the Cluny Monastery, 
and the celebrated clock of Jacques de Dondi at Padua four 
years later. Instead of a pendulum these old clocks had 
a beam which swung horizontally and turned a spindle that 

* Later researches by F. M. Feldhaus have questioned the correctness of the 
attribution to Gerbert of the invention of the wheel work clock.
gripped the wheels. A precise movement such as the clocks
have to-day was utterly unknown. Fig. 85 gives a picture 
of an old household clock. 

Pendulum Clocks. — The connection of the wheels with the
pendulum was another step onward. In 1639 Galileo proposed
to use the regular beats of the pendulum for time determination
and to keep it in motion by a wheel-work. The Dutch physicist
Huygens constructed the first pendulum clock in 1657. These
were naturally not intended for ordinary folk, and they did not
need them either, as time was no object. Scholars and noblemen
looked to hour-glasses and clepsydras (Fig. 87) for their time
before pendulum clocks became popular. In both cases large
glass bowls were used, filled either with fine dry sand or with
water, which dripped away through a small opening into a
smaller, notched vessel. As an equal amount of sand or water
flowed away within a certain period, there was not much
difficulty in ascertaining the time up to a quarter of an hour
or so. Smaller instruments, such as are to-day used for
egg-boilers, measured off still shorter periods.

Fig. 87. — Clepsydra, or Water Clock.

Fig. 88. — Shadow Column in Ancient Rome.
Sun-Dials. — Good water- and sand-glasses were, however,
extremely expensive, and the lower classes pinned their
faith to the sun clock (Fig. 86), which every paterfamilias 
of moderate skill could fashion for himself. The ancient 
civilised races knew of no other chronometer but this; in 
early Roman days there were special officials whose duty it 
was hourly to cry out the time as shown by the shadow 
columns (Fig. 88). Still farther back, people were content to 
tell the time by the direction of the sun or the length of the 
shadows thrown by trees and houses; ay, even their own 
shadows were used as clocks, for Pliny says: “I beg thee
to honour my house when thy shadow will be six feet long.”
That most decidedly was the cheapest and least complicated
movable clock in the world, always in action when its owner
was in motion and the sun shone. I wonder what we should 
do with such chronometers to-day ! 

Astronomy and Time. — Few people ever give the fact 
a thought that time is “made” by the astronomer, who even
sends it out into the world. We all of us regulate our 
watches by the chief clock in the observatory, for all public 
clocks, those of churches, stations, post-offices, etc., are either 
directly or indirectly regulated according to the observatory 
clocks. 

Fearful confusion and endless railway accidents would 
result if station clocks, for instance, were not electro- 
magnetically regulated every day from the observatory head- 
quarters (Fig. 89). 

Mariners and Time.— The astronomer sends his precise
time determinations out into the sheer endless vasts of the 
oceans, for without this the ships would run risks too dread- 
ful to think of. Each vessel possesses several clocks of 
great precision and particular shape and mounting, which 
render them independent of the tossing and rolling. These
marine chronometers are veritable masterpieces of the watch- 
maker’s art (Fig. 91). Many of them only deviate five 
seconds within a fortnight, during which period they have 
twice crossed the ocean. Sailors find occasion to regulate 
their watches in the ports of all countries, as the harbour 
officials generally send up a time signal at midday, and time 
is checked accordingly by the officers entrusted with the
regulation of the chronometers; this signal is usually given by a
cannon, or the time-ball, a large ball attached by ropes to
a signal tower, is dropped at a certain moment, say, 12 hours
0 minutes 0 seconds. This time-ball (Fig. 90) is worked by
electro-magnetism from an observatory. As soon as the
hands of the observatory clock indicate midday an electric
current is transmitted to the cables that keep the ball in place
and releases it. Any slight remissness in the motion of
watch or clock can thus easily be rectified.

Fig. 89. — Electric Time Transmission Plant at the Observatory, Berlin.

Fig. 90. — Time-ball at Wilhelmshaven.

Fig. 91. — Marine Chronometer.

The Astronomer the World’s Timekeeper. — We shall 
recognise the great importance of this later; at any rate, 
we have learnt that the astronomer keeps time for the world. 
This duty alone should suffice to establish a high practical 
value for his work. He in turn takes his time from the most 
marvellous clock of all, which through all history has gone 
with unfailing accuracy, never fluctuating for a second, with 
the rotating earth for its works and the star-set heavens for 
its dial. The transit circle, the pendulum clock and chrono- 
graph aid him in reading the time from the stars. 
The astronomer also attends to naval and explorers’
chronometers, testing them in different temperatures, and 
working out tables in advance for their regulation in localities 
remote from civilisation. 

The Nautical Almanac.— The ship driven from its course 
by tremendous gales, the expedition forcing its way onward 
through virgin forests or in the ice and snow regions of the 
Poles, are one and all guided back to safety by the astro- 
nomer’s skill. A person familiar with observation and cal- 
culation will, if stranded at any desolate place in the world, 
soon be able to ascertain his bearings according to longitude 
and latitude to a very mile if an astronomical or nautical 
almanac, a chronometer and a sextant have been left him. 
All vessels carry such an almanac, in which the exact posi- 
tions of sun, moon and planets have been determined for 
every day and hour, years in advance. The exact position 
of the moon is given for every hour, of the sun and planets 
for every day, of the stars for every ten days. The stars 
act as milestones and road-signs to the sailor who for weeks 
at a stretch sees naught but sky and water, and the Nautical 
Almanac may well be termed his sky Baedeker. 

The Sextant and its Uses.— A small instrument called
a sextant (Fig. 92) is used for the determination of the dis- 
tance two stars are apart, or the distance between the moon 
and a star, or the altitude of the sun above the horizon. A 
small telescope is attached to this instrument; an adjustable 
mirror fixed to it is moved until the image of the star, whose 
distance is to be ascertained, covers that of the moon in the 
telescope. If the observation is correctly made, an indicator 
attached to the adjustable mirror will denote the exact 
angular distance between moon and star on the graduated 
limb of the sextant. 

Astronomical Determination of Position. — We will now 
imagine a vessel to have been carried right out of its course ; 
the captain can no longer tell his bearings or the direction 
in which he should continue. How can astronomy assist 
him in the circumstances? We will endeavour to elucidate 
this as simply as possible. 

Fig. 92. — Mirror Sextant.

The division of the earth into a net of latitudinal and
longitudinal degrees renders it possible immediately to recognise
the position of any spot on the globe, when we know
its geographical longitude and latitude. If a boat were
wrecked 36 degrees 20 minutes (36° 20') north latitude and
133 degrees 20 minutes east longitude a good map would at
once inform us that this occurred near the Japanese coast, at
the island of Oki-shima. As soon as a vessel has determined
its whereabouts according to geographical longitude and
latitude and can once more take up its proper course, it is
saved from all the dangers connected with unknown surroundings,
for a vessel can only be steered with impunity
if the logs and charts distinctly set out the difficulties 
and peculiarities of the route. The captain will therefore, 
weather being propitious, have to turn to the skies for 
guidance. He first of all determines the geographical lati- 
tude. It is a clear starry night. The Pole Star, as we know, 
is stationed at the celestial North Pole. Now, the farther 
away a place is from the equator the higher the celestial 
North Pole rises above the horizon. At the terrestrial North 
Pole the Pole Star would stand right over the head of the 
spectator, in the zenith, at the equator it would just graze 
the horizon. The celestial pole is always elevated as many 
degrees above the horizon as the terrestrial place is removed 
degrees away from the earth’s equator. For instance, at
Berlin the celestial North Pole, near which the Pole Star is
situated, is at an altitude of 52½°; Berlin is 52½° removed
from the earth’s equator, so its geographical latitude is 52½°.
The altitude of the Pole Star above the horizon is therefore 
measured and the latitude* of the ship’s position found. The 
sun serves a similar purpose in the day-time. At the instant 
the sun has reached its highest point above the horizon, when 
it is in the south at 12 o’clock midday, its distance from the 
water-line has to be determined with the aid of the sextant 




Fig. 93. — Measuring the Altitude of the Sun on Board Ship. 
The angle w marks the sun's height above the horizon. 



(see the angle w shown in Fig. 93). The distance of the sun 
from the celestial equator as stated in the Nautical Almanac 
is then looked up, and these two values soon determine the 
geographical latitude of the vessel’s position. Let us 
imagine it to be 51° 10' north latitude. One part of our task
is now completed, but the longitude has next to be determined
(this is the difference between the meridian of one’s
standpoint and that of Greenwich, 0°, from which all calculations
of longitude start). This should not be difficult in
fine weather if the ship’s chronometers are acting properly. 
The elevation of the sun above the horizon has to be 
measured at a distance from the meridian, when it is rising 
or falling rapidly. The chronometers set according to 
Greenwich time have immediately to be read off. The solu- 
tion of a spherical triangle gives the local time, which has 

* The Pole Star is in reality not quite at the celestial pole, but about 2| 
full-moon breadths away. 
to be compared with the Greenwich time. Suppose there
is a time difference of three hours eleven minutes between 
the ship’s position and Greenwich, where the day is more 
advanced. The vessel must therefore be west of Green- 
wich where the sun rises at a later hour. This time differ- 
ence helps to determine the longitude. The earth rotates on 
its axis once every twenty-four hours, the sun therefore 
sweeps across all the 360 meridians of the earth during 
this period ; every spot on the globe has its midday within 
these twenty-four hours, so it takes the sun the 360th part 
of twenty-four hours to pass from one meridian to another, 
and that works out at four minutes. Places with a time 
difference of four minutes are one degree apart. As a 
time difference of three hours and eleven minutes has been 
ascertained between the ship’s position and Greenwich, or 
47¾ × 4 minutes, the vessel must be 47¾ meridians to the
west of Greenwich, or 47° 45' west longitude.
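
The arithmetic of this paragraph can be checked with a short sketch. The function name and its parameters are illustrative assumptions, not part of the original text; the only fact taken from the text is that four minutes of time correspond to one degree of longitude.

```python
def longitude_from_time_difference(hours, minutes, greenwich_ahead):
    """Convert a local-versus-Greenwich time difference into longitude.

    Hypothetical helper for illustration. Returns a tuple
    (degrees, arcminutes, hemisphere).
    """
    total_minutes = hours * 60 + minutes
    # One degree per 4 minutes of time is 15 arcminutes per minute of time.
    total_arcminutes = total_minutes * 15
    degrees, arcminutes = divmod(total_arcminutes, 60)
    # If the day is more advanced at Greenwich, the ship lies to the west.
    hemisphere = "W" if greenwich_ahead else "E"
    return degrees, arcminutes, hemisphere


# The example from the text: 3 hours 11 minutes, Greenwich ahead.
print(longitude_from_time_difference(3, 11, True))  # (47, 45, 'W')
```

Working in whole arcminutes keeps the result exact and avoids rounding a fractional degree.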

We now know where the boat is : 

51° 10' north latitude.

47° 45' west longitude. 

The map shows this to be in the northern part of the 
Atlantic Ocean, half a day’s journey east of Newfoundland, 
and if the vessel be bound for Halifax it will have to keep 
south south-west. 
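
The latitude in this fix rests on the noon observation: the sun's altitude measured with the sextant, combined with its distance from the celestial equator (its declination) as given in the Nautical Almanac. A minimal sketch follows, assuming the sun bears due south at noon and lies north of the celestial equator. The function name and the sample readings (altitude 48° 50', declination 10° north) are hypothetical values chosen to reproduce 51° 10'; they are not figures from the text.

```python
def latitude_from_noon_sun(altitude, declination):
    """Latitude from the sun's noon altitude, sun bearing due south.

    Both arguments and the result are (degrees, arcminutes) tuples.
    This sketch assumes a northern (non-negative) declination.
    Hypothetical helper for illustration only.
    """
    altitude_min = altitude[0] * 60 + altitude[1]
    declination_min = declination[0] * 60 + declination[1]
    # Zenith distance is 90 degrees minus the altitude;
    # latitude is zenith distance plus declination.
    latitude_min = 90 * 60 - altitude_min + declination_min
    return divmod(latitude_min, 60)


# Assumed sample readings, not taken from the text.
print(latitude_from_noon_sun((48, 50), (10, 0)))  # (51, 10)
```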

Should by mischance the chronometer have been rendered 
useless, the exact time can be ascertained by observing the 
stars with the help of the Nautical Almanac, for every kind 
of astronomical occurrence which a mariner can see with a 
small telescope has been calculated in advance and noted as 
a guide to sailors (the moon passing stars, its place among 
the stars at various times, etc.). At the instant any one 
of these events occurs the sailor knows Greenwich time 
to be such and such. The stars, however, assist him at other 
times than those of danger only; the vessel’s time and place 
are determined daily by astronomical means, as all others 
would not be accurate enough and could not be fully de- 
pended on in these days of rapid locomotion. Travellers 
entrusting their lives to our modern floating palaces owe a 
very considerable part of their well-being to the observer 
measuring the transit of the stars in the meridian-chamber, 
to the mathematician who compiles the tables in the Nautical
Almanac. 

Astronomy and Aeronautics. — Latterly aeronauts and 
aviators have turned to the observatories for assistance. They 
frequently encounter grave difficulties when the mists and 
clouds beneath them make it impossible to study the chart, 
or when the balloon is carried away to districts of which they 
possess no maps. There are a few cases on record of aero- 
nauts who had lost their bearings being driven out to sea, 
where a watery grave awaited them. A special kind of sex- 
tant has been designed by Marcuse, of Berlin, which serves 
for the astronomical determination of position for balloon 
and airship. 

Astronomy and History. — The statement that astronomy
has proved invaluable to historians may sound odd at first, 
and yet its truth is undeniable. All noteworthy astronomical 
occurrences have been chronicled since the earliest days, 
generally in connection with some one or other important 
political or religious event. It is often of the utmost import- 
ance to historians to be able to state the exact date of any one 
event, and as an astronomer is able to trace celestial phe- 
nomena in the past, often to the very hour of their occurrence 
thousands of years ago, historians have been helped out of a 
difficulty on innumerable occasions. We know that a battle 
was fought between the Lydians and the Medes on the Halys 
in the sixth century B.C., and that a solar eclipse occurred 
during the fight. It was determined astronomically that this 
was most likely the total eclipse of the sun on May 28th, 
585 B.C., and that the great battle must therefore have
been fought on that day. The ancient Chinese chronicle
“Tshu-king” is fraught with the deepest interest for his-
torians and astronomers. All the dates in the volume refer,
however, to the reign of the sovereign in whose time they
were entered, as, for instance, “in the eighth year of the
Emperor Fu-hi” such and such an event occurred. This
had to be converted into our time-reckoning to be of use
to European historians. The “Tshu-king” tells of a great
solar eclipse in the fifth year of the Emperor Tshun-khang’s 
reign, which had not been announced by the Court astrono- 
mers, and, as the people could not be notified, a terrible 
panic ensued throughout the country. The two forgetful
astronomers, Hi and Ho, were put to death by the Emperor’s
orders. 

There is a very celebrated work entitled the “Canon 
of Eclipses,” which was compiled by the great Austrian 
astronomer, Th. von Oppolzer, assisted by six other mathe- 
maticians, and in which all the sun and moon eclipses for 
centuries past and for the future up to A.D. 2163 have been
calculated. This book, which is primarily intended for 
historical purposes, sets down the date of the eclipse which 
ended so sadly for Hi and Ho as the morning of October 
22nd, 2137 B.C. The fifth year of the Emperor Tshun-
khang’s reign would therefore be the year 2137 of our
reckoning, and the monarch ascended the throne in 2142 B.C.
(Fig. 94 is a view of the old observatory at Peking, and in 
Fig. 95 is represented an armillary sphere, an ancient 
astronomical measuring instrument once used in the Chinese 
capital.) 

So astronomy helps us to grope our way about in the 
grey labyrinth of ages long past, and the flaring torch of 
science lights up events which appeared as distant and as 
inaccessible as the stars above. 



CHAPTER VIII 



ASTROLOGY AND SUPERSTITION 

Superstition, that extraordinarily rank weed, has struck 
strong roots in the very depths of human nature. Man, 
so little able after all to control the course of events and his 
own destiny, is again and again forced to recognise that he 
is but a toy in the hand of something so vast, so incompre- 
hensible and so unknowable that it cannot be conceived or 
included as a unit in life’s formula. Why, in a second the 
most carefully planned and constructed human creation — 
nay, even life itself — can be destroyed by a trifle of such 
insignificance that it almost seems ridiculous to contemplate, 
and yet we cannot fight against it even in thought. It is 
the story of a thunderclap in a bright sky all over again. 
And yet everything in this world has a firm, logical basis 
and occurs according to Nature’s unvarying and conse- 
quential laws, and, strictly speaking, there is no such thing 
as Chance. Yet, if it is difficult for men acquainted with
the laws of logic and theory, nature and philosophy, to 
recognise even the main principles only of those forces and 
happenings that influence a thousandfold human life and 
work, how much more difficult it must be to the untaught 
man, to the less intelligent races that lived in past centuries, 
to attempt to grasp the rudiments of these relations ! 

Basis of Superstition. — This inability forms the basis
of all superstition. Secret forces and powers, good and evil 
I. Introduction — The UNIX X Windows Digital Library Client

The X windows digital library client provides access over the Internet 
to digital libraries that support the digital library image delivery server 
protocol. This document walks the reader through the actual screens 
of the X window digital library client and describes the functionality of 
each. 



II. Login



Upon invoking the X window digital library client the user must enter
a login and press 'return' or select the 'LOGIN' button in order to access
the digital library. Once the user has logged in, searching becomes
available (Figures 1 & 2).



III. Searching 

After completing the login sequence, the search screen is displayed 
(Figure 2) and the user may now issue search queries. 

To search, the user fills in any of the available search fields — 'Author', 
'Title', and 'Catalog Identifier' — and the client will return the best 
matches from the SQL database that contains the bibliographic 
information. The user simply selects the 'SEARCH' button or hits 
'return' to activate a search. The example search returned 38 
documents (Figure 3). A search without user specified information 
will return up to the maximum search replies allowed by the server. 
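
How such a search might be expressed against the bibliographic database can be sketched as follows. The report does not give the schema or the query the server runs, so the table name `documents`, its columns, the substring matching, and the `max_replies` limit are all assumptions made for illustration.

```python
import sqlite3


def search_documents(conn, author="", title="", catalog_id="", max_replies=50):
    """Return up to max_replies rows matching the filled-in search fields.

    Hypothetical schema: documents(author, title, catalog_id).
    Empty fields are ignored, so an empty search returns up to
    max_replies rows.
    """
    clauses, params = [], []
    for column, value in (("author", author), ("title", title),
                          ("catalog_id", catalog_id)):
        if value:
            clauses.append(f"{column} LIKE ?")
            params.append(f"%{value}%")
    sql = "SELECT author, title, catalog_id FROM documents"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY title LIMIT ?"
    params.append(max_replies)
    return conn.execute(sql, params).fetchall()


# Tiny in-memory database with made-up rows for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (author TEXT, title TEXT, catalog_id TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [("Author A", "A History of Clocks", "ID-1"),
     ("Author B", "The Civil War", "ID-2"),
     ("Author C", "Navigation", "ID-3")],
)
print(search_documents(conn, title="Civil"))
print(len(search_documents(conn, max_replies=2)))
```

User input is passed only as bound parameters; the column names come from a fixed list in the code, so the query text itself never contains user-supplied strings.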



IV. Document Browsing



Once a search has returned a list of documents, the user can open the 
document by selecting the 'OPEN DOCUMENT' button. In the 
example, the document "The Popular History of the Civil War" was 
selected (Figure 4). When a document is opened a document structure 
window is displayed to assist the user in navigating through the 
document (Figure 5). Multiple views of a document may be
represented by the document structure, such as a listing of the chapters of
a book, or a listing of articles by author name or by title. In the example the
top level of the document structure contains "Pages" and "Contents". 
The Pages structure contains a linear list of all the pages that were 
scanned for this document. The Contents contains a more detailed 
description of the structure of the document. 

In our example we select the "Contents" structure and then expand this 
level by clicking 'Open/Enter Level' (Figure 6). The book is broken up 
into parts. Selecting "Text", we now use 'Open & Display' to expand
the next level (Figure 7). The "Text" level is divided into chapters.
Finally, we select "Chapter I" and select 'Open/Enter Level'
to reach the pages of "Chapter I" (Figure 8). A dot to the left of a label
in the structure window indicates there is an image associated with the
label; a plus '+' sign indicates additional levels below. The 'Open &
Display' button causes page 21 to be retrieved and displayed (Figure 9).
Using the 'Next Page' and 'Previous Page' buttons the user can view 
the pages of "Chapter I" (Figure 10). The document viewer also allows 
the user to display two pages side by side (Figure 11). 
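
The document structure described above behaves like a tree of labels. The sketch below is one way to model it, not the client's actual data model, which the report does not describe. The class name, field names, image identifiers, and the sample page labels are assumptions; only the dot and plus markers come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StructureNode:
    """One label in the document structure (hypothetical model)."""
    label: str
    image: Optional[str] = None  # assumed identifier of a page image
    children: List["StructureNode"] = field(default_factory=list)

    def marker(self) -> str:
        # A dot marks an associated image, a plus marks further levels below.
        dot = "." if self.image is not None else " "
        plus = "+" if self.children else " "
        return dot + plus


def render_level(node: StructureNode) -> List[str]:
    """List the labels one level below node, as a structure window might."""
    return [f"{child.marker()} {child.label}" for child in node.children]


chapter = StructureNode("Chapter I", children=[
    StructureNode("21", image="page-21"),
    StructureNode("22", image="page-22"),
])
contents = StructureNode("Contents", children=[
    StructureNode("Text", children=[chapter]),
])
root = StructureNode("Document", children=[
    StructureNode("Pages", children=[StructureNode("21", image="page-21")]),
    contents,
])

for line in render_level(root):
    print(line)
for line in render_level(chapter):
    print(line)
```

'Open/Enter Level' then corresponds to rendering the children of the selected node, and 'Open & Display' to fetching the image of the first node that has one.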

V. Printing

Printing is supported at the document level and the page level. The X
client currently supports printing only to the DocuTech printer. The primary
rationale for allowing only DocuTech printing is to enforce copyright
and billing procedures. At the search or structure windows (Figures 2 
& 5) the user may print the entire document by selecting the 'Print 
Document' button. The example shows how to select pages from 
Chapter I using the 'Select' button in the structure window (Figure 12). 

Selection of individual pages is accomplished with the select/deselect 
buttons on the structure window (Figure 12). The 'Print Selection' 
button initiates the print dialog box (Figure 13). The print window
indicates the number of pages to be printed, the costs associated with
printing and obtaining copyright permission, and the address to deliver
the printed document. In order to print the entire document or
selected pages, the user must acknowledge the billing information by
selecting the 'Print' button at the bottom of the print window.

VI. Seeking to Desired Page 

Next we return to the "Contents" level and enter the structure entitled
"List of Illustrations" (Figures 14 & 15). Selecting 'Open & Display' we
display the list of illustrations page (Figure 16). In the list of 
illustrations page let's say we are interested in a picture of John 
Calhoun on page 24. Now type 24 into the 'Label:' field at the top of the 
page viewer and select the 'GO TO' button (Figure 17). The viewer 
now searches for the image and displays page 24. We see the picture of 
John Calhoun (Figure 18). Notice that the structure window is
updated to reflect the new location within the document structure tree. 
The 'GO TO' command's usefulness depends on the document 
structure labels entered by the scanning technicians. A detailed 
document structure is very easy to navigate, using either the 'Next 
Page', 'Previous Page', 'Return/Exit Level' or 'Open/Enter Level'
commands or the Go To Label command. 
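
The 'GO TO' behaviour can be pictured as a search of the document structure for a matching label. The following is a minimal sketch under stated assumptions: the `(label, children)` tuples, the function name, and the sample layout (including the placement of page 24 under "Chapter I") are made up for illustration, and the real client's lookup method is not described in this document.

```python
def find_label(tree, label, path=()):
    """Depth-first search for a label.

    tree is a (label, children) pair, a hypothetical stand-in for the
    document structure. Returns the path of labels leading to the
    match, or None if the label was never entered.
    """
    name, children = tree
    here = path + (name,)
    if name == label:
        return here
    for child in children:
        found = find_label(child, label, here)
        if found is not None:
            return found
    return None


# Made-up structure for demonstration.
document = ("Contents", [
    ("List of Illustrations", []),
    ("Text", [
        ("Chapter I", [("21", []), ("22", []), ("23", []), ("24", [])]),
    ]),
])

print(find_label(document, "24"))
print(find_label(document, "999"))
```

The returned path is what would let a structure window reposition itself at the new location, and a `None` result reflects the point made above: a label can only be reached if the scanning technicians entered it.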



VII. Miscellaneous Features 

The digital library X windows client allows the user to select the 
databases to search (Figures 19 & 20), and allows the user to open
multiple documents (Figure 21). 
[Screen image: the document viewer displaying a scanned page of "The Popular History of the Civil War". Status line: Tot: 4 (s) 54 (m) Com: 2 (s) 982 (m) Ext: 0 (s) 632 (m) Dis: 0 (s) 73 (m)]

Figure 9






[Screen image: CORNELL UNIVERSITY DIGITAL LIBRARY - DOCUMENT BROWSING window. Buttons: CLOSE DOCUMENT, View, GOTO, PREVIOUS PAGE, UNDO; field 'Label:'. Title: The Popular History of the Civ; Author: Herbert George B,; Catalog Identifier: 020701 0567B3AFA2AEEC420000000300000040. The viewer shows a scanned page of the book.]